key: cord-0164119-2vj4l1yi authors: Huang, Tao; Pei, Youquan; You, Jinhong; Zhang, Wenyang title: A Class of Semiparametric Models with Homogeneous Structure for Panel Data Analysis date: 2022-04-01 journal: nan DOI: nan sha: 57ff0732c428200cda2efe111abc63889e2ef602 doc_id: 164119 cord_uid: 2vj4l1yi Stimulated by the analysis of a dataset from China about Covid-19, we propose a class of semiparametric models for panel data analysis. The proposed models account for both homogeneity and heterogeneity among the individuals of a panel data. They strike a nice balance between parsimony and risk of misspecification. Although stimulated by the analysis of a particular dataset, the proposed models apply to very broad range of panel data analysis, they are powerful in exploring nonlinear dynamic patterns of impacts of covariates or transformed covariates. An estimation procedure is presented, and its asymptotic properties are established. Intensive simulation studies are also conducted to demonstrate how well the estimation procedure works and the risk of ignoring homogeneity or heterogeneity among individuals in panel data analysis. Finally, we apply the proposed models and estimation procedure to the Covid-19 data from China, and reveal some interesting dynamic patterns of the impacts of some important factors. The data, which stimulates this paper, come from 29 provinces (for the sake of convenience, municipalities are also called provinces in this paper) in China. It includes the daily number of Covid-19 infected cases, of people from Wuhan to a province, of cured cases, and maximum daily temperature in the province from 9 January 2020 to 25 March 2020, it also includes the population size of the province. This dataset is a typical panel dataset with each individual of length 77. We focus on the log ratio, y i,t , of the cumulative number of infected cases in the ith province at time point t to that at time point t − 1. We call y i,t the infection ratio at time point t in the ith province. As cured cases may not have much effect on infection ratio, we exclude it from our analysis and focus on maximum daily temperature in a province, denoted by x i,t,1 , and the number of people from Wuhan to the province, denoted by x i,t,2 . What we are interested in are: how maximum daily temperature in a province and the number of people from Wuhan to the province affect the infection ratio in the province? Whether their impacts change over time? If they do, what are the dynamic patterns of the impacts? Whether the impacts vary over different provinces? Are there provinces sharing the same impacts? If there are, which provinces share the same impacts? The varying coefficient models are a powerful tool to explore nonlinear dynamic patterns of impacts of covariates. There is much literature about the varying coefficient models for cross sectional data Zhang, 1999, 2000; Zhang and Lee, 2000; Zhang et al., 2002 Zhang et al., , 2009 Cheng and Fan, 2016; , time series data (Cai et al., 2000; Lei et al., 2016) and panel data (Li et al., 2015 (Li et al., , 2017 Feng et al., 2021, and references therein) . We start with the varying coefficient panel data model, which leads to y i,t = b i + a i,1 (u i,t )x i,t,1 + a i,2 (u i,t )x i,t,2 + i,t , u i,t = t/77. (1.1) Whilst the dynamic patterns of the impacts of x i,t,1 and x i,t,2 have been reflected and formulated by a i,1 (·) and a i,2 (·), this modelling assumes the contribution of covariates to response is linear, functional coefficients though. This may not be realistic, indeed, in Section 5, we will see this assumption is not valid, and the contribution of covariates is in fact through transformed covariates. Furthermore, this modelling has not taken into account the effects of previous infection ratios. To overcome these problems takes us to the following model where b(·), a i,k (·)s and g i,k (·)s are unknown functions, and η i s are random effects. To answer the question that which provinces share the same impacts, we assume some of a i,k (·)s are the same, and some of g i,k (·)s are the same. We don't assume which a i,k (·)s or g i,k (·)s are the same, and let data identify. Putting this modelling idea in a more formal and generic form leads to the following class of semiparametric models for panel data analysis. Rather than confining us to the Covid-19 dataset, we assume y i,t is the observation of the response variable of the ith individual at time point t, (u i,t , X i,t ) the vector of the corresponding covariates, where X i,t = (x i,t,1 , . . . , x i,t,q ) τ is of dimension q, u i,t is a scalar. We assume y i,t = b(u i,t ) + p j=1 a i,j (u i,t )g i,j (y i,t−j ) + q l=1 a i,p+l (u i,t )g i,p+l (x i,t,l ) + η i + i,t (1.3) when t = p + 1, · · · , T i , i = 1, · · · , n. * x i,t,l is the lth component of X i,t , η i a random effect with mean 0 and variance σ 2 η . i,t is a random error with mean 0, variance σ 2 . b(·), a i,j (·)s, and g i,j (·)s are all unknown functions to be estimated. Apparently, (1.3) is not identifiable. To make it identifiable, we assume for i = 1, . . . , n, j = 1, . . . , p and l = 1, . . . , q. We also assume where α k (·)'s and β k (·)'s are unknown functions to be estimated, and {D 1 , . . . , D H } is an unknown partition of {(i, j) : i = 1, · · · , n; j = 1, · · · , p + q}, and so is {∆ 1 , · · · , ∆ m }. Except for the practical implication mentioned before, condition (1.5) also serves, in statistical modelling, to make model (1.3) more parsimonious. * We allow the time series length T i to be heterogeneous across i, which corresponding to unbalanced panel data. We would like to stress that (1.3) is a large class of semiparametric models, which include nonparametric autoregressive models (Huang and Yang, 2004; Sun et al., 2014; Lei et al., 2016; Sun, 2016; Kalli and Griffin, 2018 , and references therein), varying coefficient models and additive models (Wood et al., 2015; Chen et al., 2018; Sang et al., 2020, and references therein). Condition (1.5) is also called a homogeneous structure. Homogeneity pursuit is a very Guo and Li (2022) and references therein. In this paper, we make contributions on three fronts. Firstly, we propose a general class of semiparametric panel data models with homogeneous structure, where dynamic nonparametric panel data models, varying coefficient models and additive models are special cases. Secondly, we develop a three step approach for identifying the latent group structure and estimating group-specific functions. Thirdly, we demonstrate the advantage of the proposed methodology by asymptotic theory and empirical analysis. The rest of the paper is organised as follows. We describe our estimation procedure and the computational algorithm to implement it in Section 2. In Section 3, we present the asymptotic properties of the estimators resulted from either the proposed estimation, overfitting, or underfitting. Overfitting and underfitting will be defined at the beginning of Section 3. Intensive simulation studies are conducted in Section 4 to demonstrate how well the proposed estimation procedure works and the risk of ignoring homogeneity or heterogeneity among the individuals in a panel dataset. In Section 5, we apply the proposed models and estimation procedure to the Covid-19 dataset, and explore how maximum daily temperature in a province and the number of people from Wuhan to the province affect the infection ratio in the province. We will identify which provinces in China share the same impacts, and find out the dynamic patterns of the impacts. All theoretical proofs and technical conditions are put in the supplementary material. Throughout this paper, a superscript τ indicates the transpose of a vector or a matrix. 2 Estimation procedure The proposed estimation procedure consists of three stages: initial estimation, homogeneity pursuit, and final estimation. For the sake of convenience in the homogeneity pursuit in our estimation procedure, without loss of generality, we assume the range of each covariate involved in (1.3) and of We apply B-spline decomposition, with knots being equally placed, to deal with the unknown functions in (1.3). The number of knots used for the final estimation is larger than that for the initial estimation due to the common structure identified being used, therefore, the basis functions for the final estimation are different to that for the initial estimation. Specifically, we use B (·), = 1, · · · , K, to denote the B-spline basis functions used for the final estimation, B (0) (·), = 1, · · · , K 0 , for the initial estimation. Let Under condition (1.4), in the final estimation, where i = 1, . . . , n, j = 1, . . . , p and l = 1, . . . , q. We use superscript (0) to denote the counterpart of this notation in the initial estimation, e.g. B i,0,t is the counterpart of B i,0,t . In the initial estimation, i,p+l . (2.2) We treat all unknown functions in (1.3) as different functions to estimate. Applying the least squares estimation and the approximations in (2.2), we have the following objective i,j s, g i,j s, a i,p+l s and g i,p+l s, and denote the minimisers asb i,j s,g i,j s,ã i,p+l s andg i,p+l , i = 1, · · · , n, j = 1, · · · , p, l = 1, · · · , q, as the initial estimators of b(·) + η i , a i,j (·), g i,j (·), a i,p+l (·), and g i,p+l (·), respectively. Throughout this paper, for any function f (·) on [0, 1], we define We first estimate the partition {∆ 1 , · · · , ∆ m } by using the idea in Vogt and Linton (2017) . We start withg 1,1 (·), and compute . . , n; j = 1, . . . , p + q}. Let∆ 1 be the set of all (i, j)s that satisfy δ ij < C, where C is a given threshold. In practice, it can be selected by cross-validation. We select an element, say (i 0 , j 0 ), from S −∆ 1 , and compute Let∆ 2 be the set of all (i, j)s in S −∆ 1 that satisfy δ ij < C. Continuously doing so, we get {∆ 1 , . . . ,∆m} and use it to estimate {∆ 1 , . . . , ∆ m }. Using exactly the same approach, we can get the estimator {D 1 , . . . ,DĤ} of {D 1 , . . . , D H }. with a ι, , ι = 1, · · · , n, = 1, · · · , p + q, being replaced by α k if (ι, ) ∈D k , g ι, , ι = 1, · · · , n, = 1, · · · , p + q, being replaced by β k if (ι, ) ∈∆ k . To take the within cluster correlation into account in the final estimation, we have to get the estimators of σ 2 η and σ 2 first. We estimate σ 2 η and σ 2 based on the residuals of working independence fitting. Specifically, we minimise with respect to (b, α 1 , . . . , αĤ, β 1 , . . . , βm), and denote the minimiser by (b,α 1 , . . . ,αĤ,β 1 , . . . ,βm). LetM i be M i with b, α k s and β k s being replaced byb,α k s andβ k s. We have the following objective function: Applying the weighted least squares estimation, by simple calculation, we have the objective function: We minimise (2.5) with respect to (b, α 1 , . . . , αĤ, β 1 , . . . , βm), and denote the minimiser by (b,α 1 , . . . ,αĤ,β 1 , . . . ,βm). The final estimators of b(·), a i,j (·), g i,j (·), a i,p+l (·) and g i,p+l (·) arê b(·) = (B 1 (·), . . . , B K (·)) τb , The main hurdle in the implementation of the proposed estimation method is the minimisation of (2.3) and (2.5). Because neither of them has a minimiser with closed form, we are going to minimise them by an iterative approach. Applying the three-step spline estimation method proposed in Hu et al. (2019), we can easily get the initial values for g i,j 's and g i,p+l 's in (2.3). Replacing g i,j 's and g i,p+l 's in (2.3) by their initial values and minimising (2.3) with respect to b i,j 's and a i,p+l 's, we take the resulting minimisers as the initial values for b i,j 's and a i,j 's and a i,j 's and g i,p+l 's, we take the resulting minimisers as the updated values for g i,j 's and i,p+l 's. Continuing this iterative process until convergence, we get the minimiser of (2.3). The minimisation of (2.5) is similar to that of (2.3). We start by choosing the initial values for β k s in (2.5) based ong i,j s andg i,p+l s obtained in Section 2.1.1. Specifically, the initial value for β k is taken to be We replace β k s in (2.5) by their initial values, then minimise (2.5) with respect to b and α k s. We take the resulting minimisers as the initial values for b and α k s and substitute them for b and α k s in (2.5), then minimise (2.5) with respect to β k 's. We take the resulting minimisers as the updated values for β k s, and continue this iterative process until convergence to get the minimiser of (2.5). Remark 1. The estimation of the partition in Section 2.1.2 can be further improved. Taking {∆ 1 , . . . ,∆m} as an example, we can apply the following iterative process to improve the estimation: (1) Computeḡ where |∆ k | is the cardinality of∆ k . (2) For each (i, j) ∈ S, we compute If c k 0 is the smallest c k , then (i, j) belongs to set∆ (1) k 0 . This leads to a partition m } as {∆ 1 , . . . ,∆m}, and repeat (1) and (2). Continuously doing so until convergence, we get an improved estimator of the partition {∆ 1 , · · · , ∆ m }. Remark 2. In the final estimation, an iterative process can be used to further improve the final estimators. Specifically, we substitute the minimiser of (2.5) forb,α k s andβ k s in (2.4), and minimise (2.4). We treat the resulting minimiser as updatedσ 2 η andσ 2 , and substitute them for theσ 2 η andσ 2 in (2.5), then minimise (2.5) and substitute the resulting minimiser forb,α k s andβ k s in (2.4), and minimise (2.4) to get updatedσ 2 η andσ 2 . Continue this iterative process until convergence, and substitute the convergedσ 2 η andσ 2 for theσ 2 η andσ 2 in (2.5), then minimise (2.5) to get the finalb,α k s andβ k s based on which improved final estimators of b(·), a i,j (·), g i,j (·), a i,p+l (·), and g i,p+l (·) are obtained. In this section, we address several practical problems regarding the selection of the values of the tuning parameters in the proposed methods. Following the lead of Hu et al. (2019), we select the number of knots K 0 via a Bayesian information criterion (BIC) approach: The residual sum of squares (RSS) is defined in (2.3), which reflects the goodness of fit. Here, N = (2p + 2q + 1)K 0 controls the complexity of the model, and we assume T i = T for the sake of clarification. The optimal number of knots K 0 can be estimated by minimising the above BIC. According to Theorem 1, it is reasonable to choose the optimal number of knots K 0 on the interval 0.5T 1/5 , 2T 1/5 , where a denotes the largest integer not larger than a. The number of knots K in the final estimation can be selected using the same approach. The homogeneity pursuit method relies on prior information about the threshold C. However, it is usually unknown in practical applications and needs to be determined via a data-driven method. Actually, selecting C is equivalent to selectingĤ andm. In this paper, following the lead of Lian et al. (2021) , we also apply the cross-validation method to select the two tuning parameters,Ĥ andm. In particular, we implement a V -fold cross-validation approach. For a given pair {H, m}, we remove the 1/V th member of the observed time points for {(y i,t , u i,t , X i,t ) , i = 1, . . . , n, t = 1, . . . , T i } as a validation set. We estimate the semiparametric model (1.3) with identified homogeneity structure on the remaining data, compute the squared error between y i,t and the fitted values on the validation set, and repeat this procedure V times to calculate the crossvalidated mean squared error. The optimalĤ andm can be estimated by minimising the cross-validation mean squared error. In this section, we are going to demonstrate the advantage of the proposed methodology by asymptotic theory. We will consider three different approaches: • Over-fitting: Treat all unknown functions in model (1.3) as different functions to estimate. Namely, directly apply the final estimation described in Section 2.1.3 under the assumption that either of the two partitions involved is {{(i, j)} : i = 1, · · · , n; j = 1, · · · , p + q}. • Under-fitting: Directly apply the final estimation to estimate the unknown functions under the assumption a 1,j (·) = · · · = a n,j (·), g 1,j (·) = · · · = g n,j (·), for j = 1, . . . , p + q. • Correct fitting: The proposed estimation is used to estimate the unknown functions. In this section, we assume T i → ∞, n possibly diverges to infinity, but H and m are fixed. This agrees with many real applications where H and m are expected to be small, and thus, there is a significant reduction of the unknown functions by clustering similar functions. To keep the presentation concise, we state the asymptotic theorems in this section and leave all technical proofs in the supplementary material. Let i,x,j,1 , . . . , B Ã i,j (u) is a {(p + q + 1)K 0 }-dimensional vector consisting of (p + q + 1) blocks of length K 0 , with the (j + 1)th block being B (0)τ i (u), others all being 0.G i,j (x) is a {(p + q)K 0 }dimensional vector consisting of (p + q) blocks of length K 0 , with the jth block being we have i,j are of order From Theorem 1, it is easy to see that the convergence rate of the estimatorsb(u), a i,j (u) andĝ i,j (x) are of order T −2/5 i , which is as expected, as we assume that the functions are twice differentiable. Theorem 2 (Under-fitting case). Suppose the functions a i,j (·)s are sufficiently separated, i.e., whereb(u),â i,j (u) andĝ i,j (x) are obtained by underfitting. Theorem 2 shows that the estimators obtained by underfitting are not consistent. Before presenting Theorem 3, we introduce some notations: let A i,j (u) is a {(p + q + 1)K}-vector consisting of (p + q + 1) blocks of length K, with the (j + 1)th block being B τ i (u), others all being 0. G i,j (x) is a {(p + q)K}-vector consisting of (p + q) blocks of length K, with the jth block being B τ i,j (x), others all being 0. Theorem 3 (Correct fitting case). Under the technical conditions (C1)-(C5) in the sup- whereb(u),â i,j (u) andĝ i,j (x) are obtained by correct fitting, and the bias terms r(u), r i,j (u), d i,j (x) are defined as Theorem 1 with order of K −2 . Ξ 1 and Ξ 2 are defined in condition (C5), and Theorem 3 shows that the convergence rates of the estimatorsb(u),â i,j (u) andĝ i,j (x) are of order N −2/5 . This together with Theorem 1 shows that the estimators obtained by correct fitting have convergence rates with a higher order than those obtained by overfitting. Therefore, they are more accurate. Because the asymptotic variance of the above estimators has a very complicated form and it is not clear how to estimate it consistently, constructing a statistical inference based on the asymptotic normality established in Theorem 3 can be very challenging. In this paper, we do not consider the inference problem and leave it as an open question. In this section, we use a simulated example to demonstrate how well the proposed estimation procedure works and the risk of ignoring the homogeneity or heterogeneity among individuals. In particular, the data are generated from the following semiparametric panel data model: where u i,t are generated from a uniform distribution U (0, 1). The exogenous covariates are generated via a first-order autoregressive (AR) process with parameters ρ and σ, denoted by AR(1; ρ, σ), which is given by standard normal random variables. In particular, x i,t ∼ AR(1; 0.6, 0.5), and i,t and η i are independently generated from N (0, 0.1 2 ). The homogeneous coefficient functions and additive functions are generated as follows: b(u) = 1.5 cos(2πu), and for each i = 1, 2, . . . , n, 1.3u sin(2πu) + 1, when i = 1, 2, · · · , n/2, 1.3u cos(2πu) + 1, when i = n/2 + 1, · · · , n. a i,2 (u) = 2 sin(1.5πu) − 1.2(u − 0.5)(1 − u) + 1, and g i,1 (y i,t−1 ) = −0.4(3 − y 2 i,t−1 )/(1 + y 2 i,t−1 ), 2 cos (πx i,t,2 /2) + 1.8 sin (πx i,t,2 /3) , when i = 1, 2, · · · , n/2, 1.5 sin (πx i,t,2 /4) − 1.2 cos (πx i,t,2 /3) , when i = n/2 + 1, · · · , n. For the latent structure functions, Figure 1 describes a toy example of a i,j (·) and g i,j (·) (i = 1, . . . , n and j = 1, 2), where different colours denote different functions. We run the simulated example 100 times with various n and T and compare our proposed approach to its potential competitors based on the following performance metrics: Estimation accuracy For an estimatorâ i,j (·) of a i,j (·), its estimation accuracy can be evaluated based on the mean integrated squared error: To prevent the performance from being dominated by the poor boundary behaviour, we let the integral domain be a non-boundary region, which is between the 1st and 99th quantiles of {u it }. Homogeneity pursuit consistency To evaluate the distance between the detected homogeneous structure and the true one, we use the normalised mutual information (NMI) (Ke et al., 2016) , which measures the similarity between two partitions. Suppose that where and The 1, . . . , n, j = 1, . . . , p + q}, obtained in stage 2 of the proposed estimation procedure in Section 2.1, we calculate NMI(D, D) to assess how close the true homogeneous structure in a i,j (·) is to the estimated one. Similarly, we can calculate NMI(∆, ∆) to assess how close the true homogeneous structure in g i,j (·) is to the estimated one. The sample size T i = 100, 200 or 400 and n = 20 or 40. We estimate the unknown functions by overfitting, correct fitting and underfitting, respectively. The mean and standard deviation of NMI for the resulting estimators are presented in Table 1 and the mean squared errors of the resulting estimators are presented in Table 2 . The initial estimates for subject-specific varying-coefficient functions a i,j (·) and additive functions g i,j (·) are presented in Figures 2 and 3 . From these figures, we can see that the initial estimates for a i,1 (·) and g i,2 (·) have an obvious group structure, which is a preliminary validation of our simulation setup. Based on the initial estimation, we can identify the latent group structure using the procedure in Section 2.1.2. Note that in our simulation, the true partition for both a i,j (·) and g i,j (·) has three groups. Figures 4 and 5 present one simulation result of this step, which is aligned with our simulation setting. For example, in Figure 4 , the left-hand side shows the initial estimates for a i,j (·), the middle is the L 2 distance matrix calculated based on a i,j (·) and the right-hand side is the result of the homogeneity pursuit, which has three groups. The group membership is denoted by the colours. 13 17 11 15 16 12 20 18 14 19 2 3 1 8 10 7 5 4 6 9 39 37 23 36 32 33 22 34 26 31 38 25 21 30 28 35 24 29 27 The mean and standard deviation of the NMIs for a i,j (·) and g i,j (·) are shown in Table 1 . We can observe that as T gets larger, the performance becomes better. This makes sense since the initial estimates improve as the sample size T increases. We have illustrated the performance of homogeneity pursuit. We now assume that the true group structure is known in this step and adopt the procedure in Section 2.1.3 to estimate the group-specific functions. The final estimates are shown in Figures 6 and 7 . In particular, we compare the estimation performance by plotting the final estimates and the Finally, in Table 2 , we compare the performance of correct fitting with overfitting and underfitting. We can observe that among these three methods, correct fitting provides the best performance with the lowest MISE. For correct fitting and overfitting, the overall estimation accuracy improves as T increases, which is somewhat expected, since as T increases, the initial estimates are better and the subsequent procedures rely more on the initial estimates. Note that the functions estimated by underfitting are not consistent, Note that overfitting and underfitting, which either ignore or mistakenly specify the homogeneous structure, perform worse than correct fitting, which illustrates the importance of incorporating the homogeneous structure in panel data analysis. The growth rate of cumulative confirmed cases (GRCC) is the response variable. It is denoted by y i,t and is measured by log (Z i,t ) − log (Z i,t−1 ). † To explore how the influential factors contribute to the GRCC, we focus on two covariates: (1) the proportion of the population of Wuhan travelling to the ith province on day t − 14, denoted by x i,t,1 ; (2) the maximum daily temperature in the ith province on day t, denoted by x i,t,2 . We normalise the covariates x i,t,1 and x i,t,2 to mean 0 and variance 1 first, then apply the following time-varying-coefficient additive model to fit the data where i = 1, . . . , 29 and t = 1, . . . , 77. The proposed estimation procedure is implemented to estimate the unknown functions in the model, and explore how the maximum daily temperature in a province and the number of people who travelled from Wuhan to the province affect the infection ratio in the province. We also identify the provinces which share the same impacts of the two influential factors, and the dynamic patterns of the impacts. Specifically, we first apply the proposed method to estimate the subject-specific coefficients, the obtained initial estimates of time-varying coefficients and additive functions are presented in Figure 8 . We can see that bothâ i,j (·) andĝ i,j (·) have group structures, which verifies the necessity of considering a latent group structure model to characterise this homogeneity. A question naturally arises: How many groups are there and which provinces share similar impacts? We calculate the L 2 distance between different functions, and identify the latent group structure based on the proposed method in Section 2.1.2. The obtained results are presented in the supplementary material. For each identified group, we plot the initial estimates of † The time series plot of y i,t for 29 provinces in China from 23 January to 8 April is presented in the supplementary material. belonging to the same group have similar patterns, which illustrates that our proposed homogeneity pursuit approach performs well. Finally, based on the identified group structure, we estimate b(·), α k (·) and β k (·) according to the procedure proposed in Section 2.1.3. The results are presented in Figure 11 . We summarise the group-specific functions and the corresponding functional changes in Table 3 . Group 1 The coefficients first increases, then decreases and finally increases to zero. Group 2 The coefficients first decreases, then increases and finally decreases to zero. The coefficients first increases and then decreases to zero. The coefficients first decreases and then increases to zero. The coefficients increases from negative to zero gradually. The coefficients decreases from positive to zero gradually. Group 1 The function decreases from positive to negative. The function increases gradually and then remain the same. The function first decreases, then increases and finally remain the same. The function first decreases, then increases. This paper has presented a mechanism for identifying and estimating latent group structures in a class of semiparametric panel data models in which the varying coefficients and additive functions are heterogeneous across groups but homogeneous within a group and (c) Figure 11 : (a) estimated trend function b(·). (b) estimated group-specific varying-coefficient functions α k (·), k = 1, · · · , 6. (c) estimated group-specific additive functions β k (·), k = 1, · · · , 4. the group membership is unknown. In particular, we have considered identifying the group membership by a distance-based approach and estimating the group-specific functions with a three-stage method. We also established the asymptotic properties for the estimators obtained by overfitting, underfitting and correct fitting. The results of the simulations and an analysis of real data were provided to illustrate the finite-sample performance of the proposed method. Functional subspace clustering with application to time series Grouped patterns of heterogeneity in panel data Functional-coefficient regression models for nonlinear time series Estimating latent group structure in time-varying coefficient panel data models Error variance estimation in ultrahigh-dimensional additive models Peter hall's contributions to nonparametric function estimation and modeling Forward variable selection for sparse ultra-high dimensional varying coefficient models Statistical estimation in varying coefficient models Simultaneous confidence bands and hypothesis testing in varying coefficient models Varying coefficient panel data model with interactive fixed effects Homogeneity and structure identification in semiparametric factor models Estimation and identification of a varying-coefficient additive model for locally stationary processes Identification of non-linear additive autoregressive models Bayesian nonparametric vector autoregressive models Structure identification in panel data analysis High dimensional dynamic covariance matrices with homogeneous structure Estimation of semivarying coefficient time series models with arma errors Model selection and structure specification in ultra-high dimensional generalised semi-varying coefficient models A functional varying-coefficient single-index model for functional response data Subgroup identification via homogeneity pursuit for dense longitudinal/spatial data When will the covid-19 pandemic peak Homogeneity pursuit in single index models based panel data analysis Panel forecasts of country-level covid-19 infections Network-based clustering for varying coefficient panel data models Tuning-free heterogeneous inference in massive networks Estimation of sparse functional additive models with adaptive group lasso Identifying latent structures in panel data Sieve estimation of time-varying panel data models with latent structures Functional-coefficient spatial autoregressive models with nonparametric spatial weights A semiparametric spatial dynamic model The interplay of demographic variables and social distancing scores in deep prediction of us covid-19 cases Classification of non-parametric regression functions in longitudinal data models Homogeneity pursuit in panel data models: Theory and application Generalized additive models for large data sets Statistical insights into deep neural network learning in subspace classification Homogeneity structure learning in large-scale panel data with heavy-tailed errors A new multilevel modelling approach for clustered survival data A semiparametric model for cluster data Variable bandwidth selection in varying-coefficient models Local polynomial fitting in semivarying coefficient model