key: cord-0655789-ztxsmrhq authors: Naderi, Mehrdad; Mirfarah, Elham; Bernhardt, Matthew; Chen, Ding-Geng title: Semiparametric inference for the scale-mixture of normal partial linear regression model with censored data date: 2020-11-15 journal: nan DOI: nan sha: 11272a9ef1773faa79a03af484b7ad33260eb217 doc_id: 655789 cord_uid: ztxsmrhq In the framework of censored data modeling, the classical linear regression model that assumes normally distributed random errors has received increasing attention in recent years, mainly for mathematical and computational convenience. However, practical studies have often criticized this linear regression model due to its sensitivity to departure from the normality and from the partial nonlinearity. This paper proposes to solve these potential issues simultaneously in the context of the partial linear regression model by assuming that the random errors follow a scale-mixture of normal (SMN) family of distributions. The proposed method allows us to model data with great flexibility, accommodating heavy tails, and outliers. By implementing the B-spline function and using the convenient hierarchical representation of the SMN distributions, a computationally analytical EM-type algorithm is developed to perform maximum likelihood inference of the model parameters. Various simulation studies are conducted to investigate the finite sample properties as well as the robustness of the model in dealing with the heavy-tails distributed datasets. Real-word data examples are finally analyzed for illustrating the usefulness of the proposed methodology. Regression models form the basis for a large number of statistical inference procedures. The main purpose of regression analysis is to explore the relationship between a continuous response variable Y and a p-dimensional covariate vector x ∈ R p . More precisely, the regression models aim to find the expected value of Y for a given level of covariate vector x, say E(Y |x) = g(x). These models can be found in a broad array of scientific fields, including econometrics, engineering and medical studies, and allow judgments to be made on data within these fields. The parametric regression models in which the function g(·) can be specified in terms of a small number of parameters are widely used for data exploration since the parameters can be interpreted as the effects of covariates on the response variable. For instance, the classical linear regression model assumes that Y = β ⊤ x + ǫ, where ǫ represents the error term followed by a normal distribution and β ⊤ = (β 0 , β 1 , . . . , β p−1 ). However, the nonlinearity (or partial nonlinearity), as well as invalid distribution assumptions, might increase the model misspecification and misleading inference. In this regard, the semiparametric techniques can provide an alternative platform for data analysis. One of the most acknowledged semiparametric models in the regression framework is the partially linear regression (PLR) model. The PLR model assume that the response variable Y not only has linear relationship with certain covariates but also is regressed by another covariate z with unknown smooth function. More concretely, the PLR model allows both parametric and nonparametric specifications in the regression function and can be written as (1) where ψ(·) is an unknown smooth function. It can be observed from (1) that the covariates are separated into parametric components x and nonparametric component z. The parametric part of the model can be interpreted as a linear function, while the nonparametric part frees the rest of the model from any structural assumptions. As a result, the estimator of β is less affected by the model bias. Since the introduction of PLR model by [1] , it has received attention in economics, social and biological sciences. See for instance [2, 3, 4, 5] and [6] among others. Moreover, various alternative approaches to the penalized least-squares method, considered by [1] , were proposed for estimating the PLR models. For instance, [7] exploited the profile least squares estimator for β and the Nadaraya-Watson kernel estimate [8, 9] for the unknown function. [10] also proposed a quasi-likelihood estimation method. However, to the best of our knowledge, none of the previous works consider specific distributions on the error term. In this article we propose three aspects. The first and main objective of this contribution is to develop a PLR model in which the error term is followed by an SMN distribution. The SMN family of distributions is an extension of the normal model with fat tails. It contains the student-t, slash, Laplace and contaminated normal as the special cases. Comprehensive surveys of the SMN family of distributions are available in [11, 12] and [13] , among others. Our PLR model secondly implements the basis splines estimator as a powerful nonparametric approach of kernel estimation. As discussed in Section 2, the spline functions are piecewise polynomial functions where the weights in the sum are parameters that have to be estimated. The basis functions of the spline will allow us to build a flexible model across the whole range of the data. The regression models with censored dependent variable have been considered in fields such as econometric engineering analysis, clinical essays, medical surveys, among others. Lastly our PLR model with B-spline consideration is enriched by assuming the interval-censoring scheme on the response variable, as an extension of [12, 13] . The rest of the paper is therefore organized as follows. In Section 2, we provide a brief overview on the SMN family of distributions as well as B-spline functions. Section 3 then presents the scale-mixture of normal partial linear regression model with interval-censored data, hereafter PLR-SMN-IC model. We also discuss on the implementation of a expectation-conditionally maximization either(ECME) algorithm, proposed by [14] , for obtaining the maximumlikelihood (ML) parameter estimates in Section 3. Simulation studies are conducted in Section 4 to examine the properties of the proposed methodology. Finally the superiority of our model is illustrated in Section 5 by analyzing two real-world datasets. Concluding remarks and possible directions for future research are discussed in Section 6. In this section we briefly review the SMN family of distributions and polynomial spline basis function. For the sake of notation, let φ(·; µ, σ 2 ) and Φ(·; µ, σ 2 ) represent the probability density function (pdf) and cumulative distribution function (cdf) of a normal distribution with mean µ and variance σ 2 , denoted by N (µ, σ 2 ). Formally, the scale-mixture of normal (SMN) distribution is generated by scaling the variance of a normal distribution with a positive weighting random variable U . More specifically, a random variable V is said to have a SMN distribution, denoted by SMN (µ, σ 2 , ν), if it is generated by the stochastic linear representation where Z ∼ N (0, σ 2 ), U is a scale-mixing factor with the cdf H(·; ν), indexed by the parameter ν, and the symbol ⊥ indicates independence. Referring to (2) , the hierarchical representation of the SMN distribution is Accordingly, the pdf of the random variable V can be expressed as In what follows, f SMN (·; ν) and F SMN (·; ν) will be used to denote the pdf and cdf of the standard SMN distribution (µ = 0, σ 2 = 1). Depending on the choice of H(·; ν), a wide range of distributions can be generated using (3) . We focus on a few commonly used special cases of the SMN family of distributions in this paper: • Normal (N) distribution: The SMN family of distributions contains the normal model as U = 1 with probability one. • Student-t (T) distribution: If U ∼ Gamma(ν/2, ν/2), where Gamma(α, β) represents the gamma distribution with shape and scale parameters α and β, respectively, the random variable V then follows the Student-t distribution, V ∼ T (µ, σ 2 , ν). For ν = 1 the Student-t distribution turns into the Cauchy distribution which has no defined mean and variance. • Slash (SL) distribution: Let U in (2) follows Beta(ν, 1), where Beta(a, b) signifies the beta distribution with parameter a and b. Then, V distributed as a slash model, denoted by V ∼ SL(µ, σ 2 , ν), with pdf • Contaminated-normal (CN) distribution: Let U be a discrete random variable with pdf where I A (·) represents the indicator function of the set A. The random variable V in (2) then follows the contaminated-normal distribution, V ∼ CN (µ, σ 2 , ν, γ), which has the pdf Note that in the pdf of CN distribution, the parameter ν denotes the proportion of outliers (bad points) and γ is the contamination factor. More technical details and information on the SMN family of distributions, used for the calculation of some conditional expectations involved in the proposed EM-type algorithm, are provided in the Appendix A with proof in [12] . The basis spline function, or B-spline in short, is a numerical tool that was originally introduced by [15] and recently received considerable attention in the statistical analysis of density estimation. For a given degree, smoothness and domain partition, the B-spline function provides a sophisticated approach to approximate the unknown function ψ(·). To approximate ψ(x) on the interval [a, b], any spline function of d degree on a given set of m interior knots, say a = x 1 < x 2 < · · · < x m+2 = b, can formally be represented as a unique linear combination of B-splines: where the B-spline of d degree is defined recursively by Although the piecewise linear approximation, case with d = 1, is attractively simple, it produces a visible roughness, unless the knots x i are close to each other. Theoretically, a B-spline function of degree d with m knots should be a (d − 1) continuously differentiable function, in each of the intervals (a, x 2 ), (x 2 , x 3 ), . . . , (x m+1 , b). This condition may complicate the approximation issue. However, practical studies confirm that quadratic and cubic B-spline functions usually provide robust platforms that have the minimal requirement, ψ(x) should be twice continuously differentiable smooth function. Comprehensive review on the theory of spline function can be found in [16] . In this paper, we consider the cubic B-spline function, i.e. d = 3. The number of interior knots is also chosen by either m 1 = [n 1/3 ] + 1 or m 2 = [n 1/5 ] + 1, where [a] denotes the largest integer smaller than a and n is the sample size. For the locations of knots, two scenarios are considered: the equally-spaced (ES), and the equally-spaced quantile (ESQ). As investigated in our simulation studies, our strategy works well under these assumptions. However, for the practical studies if a large number of knots is required and there is not enough data on the boundary, we may need to obtain the knots through an unequally spaced technique to avoid singularity problems [17] . 3 Proposed methodology In this section, we consider the partial linear regression model where the random error follows an SMN family of distributions. Let Y = {Y 1 , . . . , Y n } be a set of response variables and x i a vector of explanatory variable values corresponding to Y i . The PLR model based on the SMN distribution is defined as where iid represents the independent and identically distributed, z i is a univariate covariate such as the confounding factor, and ψ(·) is an unknown smooth function playing the role of the nonparametric component. Clearly we can Censored time-to-event data is widely seen in health studies, such as, time-to-death in cancer research, time-toinfection in HIV, TB and COVID-19 studies. Among all the censored data, the interval-censored data is the most general type of data, which covers the typical right-censored and left-censored data as special cases. Interval-censored data can also be generated from other science fields, such as detection limits of quantification in environmental, toxicological and pharmacological studies. Therefore, we focus on interval-censored data in this paper. We assume that the set of joint variables {W i , ρ i } are observed where W i represents the uncensored reading the left-censoring (or right-censoring) is occurred and in the case −∞ = C i1 < C i2 = +∞ the interval-censored realization is observed. We establish our methodology based on the interval-censoring scheme, however, the left/right-censoring schemes are also investigated in the simulation and real-data analyses. We will refer to the PLR model of censored data based on the special cases of the SMN family of distributions as PLR-N-IC, PLR-T-IC, PLR-SL-IC and PLR-CN-IC for the N, T, SL and CN cases, respectively. For ease of exposition and based on the aforementioned setting, one can divide Y to the sets of observed responses, Y o , and censored cases. We can therefore view Y as the latent variable since it is partially unobserved. Under these assumptions, the log-likelihood function for Θ = (β ⊤ , σ 2 , ν) of the PLR-SMN-IC model can be written as where µ i = β ⊤ x i + ψ(z i ), ρ = (ρ 1 , . . . , ρ n ) ⊤ are the censoring indicators, and w = (w 1 , . . . , w n ) ⊤ denote the realizations of W = (W 1 , . . . , W n ) ⊤ . Due to complexity of the log-likelihood function (6), there is no analytical solution to obtain the ML estimates of parameters and the smooth function. A numerical search algorithm should therefore be developed. With the embedded hierarchical representation (3) and B-spline function, an innovative EM-type algorithm is developed to calibrate the PLR-SMN-IC model to the data. In this section, an EM-type algorithm is developed for calibrating the PLR-SMN-IC model to the data. In order to do this, we first replace the basis expansion of ψ(z i ) defined in (4) into the model (5) . Immediately, we can obtain where For convenience, the obtained model in (7) can be rewritten as . . , n. Now using (3), the hierarchical representation of the PLR-SMN-IC model is For the realizations y = (y 1 , . . . , y n ) ⊤ and the latent values u = (u 1 , . . . , u n ) ⊤ , the log-likelihood function for Θ = (β ⊤ , σ 2 , ν) associated with the complete data y c = (w ⊤ , ρ ⊤ , y ⊤ , u ⊤ ) ⊤ , is therefore given by where h(·; ν) is the pdf of U i and c is an additive constant. We then develop an expectation conditional maximization either (ECME; [14] ) algorithm to estimate parameters from the PLR-SMN-IC model. As an extension of expectation conditional maximization (ECM; [18] ) the ECME algorithm has stable properties (e.g. monotone convergence and implementation simplicity) and can be implemented faster than ECM. The ML parameter estimates via ECME algorithm are obtained by maximizing the constrained Q-function with some CM-steps that maximize the corresponding constrained actual marginal likelihood function, called CML-steps. The ECME algorithm for ML estimation of the PLR-SMN-IC model proceeds as follows: • Initialization: Set the number of iteration to k = 0 and choose a relative starting point. • E-Step: At iteration k, the expected value of the complete-data log-likelihood function (8), known as the Q-function, is computed as where uy 2 In what follows, we discuss the computation of conditional expectations for both uncensored and censored cases. (i) For the uncensored observations, we have ρ i = 0 and so,û The closed form of the conditional expectations for the particular cases of the SMN family of distributions are provided in Appendix A. • CM-step: The M -step consists of maximizing the Q-function with respect to Θ (k) . The maximization of (9) overβ and σ 2 lead to the following CM estimators: • CML-step: The update of ν crucially depends on the conditional expectationΥ which is quite complicated. However, we can update ν through maximizing the actual log-likelihood function aŝ The R function nlminb(·) is used to update ν in the numerical parts of the current paper. To facilitate the update of ν = (ν, γ) for the PLR-CN-IC model in the above ECME algorithm, one can introduce an extra latent binary variable B i such that B i = 1 if an observation y i is a bad point (outlier) and B i = 0 if y i is a good point. The hierarchical representation of the PLR-CN-IC model can therefore be written as where B(1, ν) denotes the Bernoulli distribution with success probability ν. Consequently, by computing the Qfunction based on (11), the update of ν isν (k+1 , for the uncensoed cases, , for the censoed cases. Since there is no closed-form solution forγ (k+1) , we update γ by maximizing the constrained actual observed loglikelihood function (10) as a function of γ. The choice of starting points plays a critical role speeding up parameter estimation via the EM-type algorithm and to guarantee reaching stationarity in the ML solutions. As a convenient approach to generate sensible initial values, we fit a classical linear model to data and find the estimate of β. To do this, the R function "lm(·)" is used. We also set σ 2 as the average of squared residuals of the classical linear model. The obtained parameter estimates of β and σ 2 are used as the initial points for implementing PLR-N-IC model. By calibrating the PLR-N-IC model to the data, the parameter estimates are exploited as the starting values for the PLR-T-IC, PLR-SL-IC, PLR-CN-IC models. We adapt the scale mixture factor parameterν (0) so that it corresponds to an initial assumption near normality. For instance, we setν (0) = 20 in the PLR-T-IC, PLR-SL-IC models. The process of the EM algorithm can be iterated until a suitable convergence rule is satisfied. Herein, the incremental likelihood property of EM-type algorithm is used to detect the convergence. We terminate the algorithm either when the maximum number of iterations K max = 2000 has been reached or where ℓ(·) is the log-likelihood function defined in (6) and ε is a user specified tolerance. In our study, the tolerance ε is considered as 10 −5 . The models in competition in our data analysis are compared using the most commonly used information-based measures. Following [17] , we vary the number of knots in a relatively large range and choose the one which minimizes the Akaike information criterion (AIC) or Bayesian information criteria (BIC) defined as where m, d, p and s denote, respectively, the number of knots, degree of the spline, number of covariates and number of parameters of the scale mixing factor U , and ℓ max is the maximized log-likelihood value. Models with lower values of AIC or BIC are considered more preferable. It should be noted that m ranging from 3 to 10 are usually adequate and the results are quite stable when we vary the number of knots. In this section, four Monte-Carlo simulation studies are conducted in order to verify the asymptotic properties of the ML estimates, to assess the fitting performance of the model, and to check the robustness of the proposed model in dealing with highly peaked, heavily tailed data as well as its robustness in presence of outliers. For the sake of data generation, it should be noted that one of the simplest ways of interval-censored data generation is to consider are two independent continuous variables followed by U(0, c) such that the non-informative condition (1.2) of [19] is fulfilled. Here U(a, b) represents the uniform distribution on interval (a, b). Recommended by [19] , a way to go around non-informative condition is to construct with c = 1, which can be shown that fulfills the non-informative condition. Let y = (y 1 , . . . , y n ) ⊤ be the n realizations from model (5) . To have a p% interval-censored dataset, the following steps are used in our simulation studies. 1) Compute the number of censored samples N C = [n × p] + 1 and then, generate an index set, IN D, as a sample of size N C from {1, 2, · · · , n} without replacement by using the R function sample( ). 2) For i = 1, . . . , n, if i ∈ IN D, we then generate two independent random variables, U 1 i and U 2 i , independently from U(0, c). Finally, we have To assess the performance of the ECME algorithm for obtaining ML estimations, we conduct an extensive simulation study under various scenarios. We generate the response variable from the following model where ǫ i is generated from either the N, T, SL or the CN distributions for various sample sizes ranging from 50 to 800, ψ(z) = exp(z/3) − 1, and β = (1, 2, −2). We also consider 1) and z i ∼ U (−1, 2) . Furthermore, we set σ 2 = 2, ν = 3 for the T and SL distributions and ν = 0.4 and γ = 0.3 for the CN model. The considered levels of interval-censoring are 7.5% and 30%. In each replication of 500 trials, we fit the special cases of the PLR-SMN-IC model to the data under both assumptions of the number of interior knots m 1 and m 2 explained in Section 2. The locations of knots are also chosen based on the ESQ scenario, even though some results are observed based on the ES method. Table 1 depicts the average values of the AIC and BIC across all generated samples. As can be expected, the values of AIC and BIC are increased by exceeding the percentage of censoring. Results in Table 1 suggest that the chosen optimal number of interior knots depends on the level of censoring and the considered model. For instance, the BIC values of the PLR-T-IC model highlight the outperformance of the m 2 approach. On the other hand, for the PLR-CN-IC model, one can conclude that the optimal number of interior knots for a 7.5% censoring level is obtained via m 1 and for 30% via m 2 . The number of outperformance of the m 2 approach is however significantly higher than m 1 in this simulation study. We therefore only focused on the m 2 approach in what follows of this simulation to shorten the length of the paper. To investigate the estimation accuracies, we compute the bias and the mean squared error (MSE): whereθ j denotes the estimate of a specific parameter at the jth replication. In addition, we are interested in examining the accuracy of the ψ(z) estimation in terms of the averaged integrated absolute bias (IABIAS) and mean integrated square error (MISE) defined as , is the estimate of ψ(·) in the jth replication. Figure 1 plots the BIAS and MSE of the regression parameters (β), σ 2 and the IABIAS and MISE of the estimated ψ(z) as a function of sample sizes for two levels of censoring. It can be observed that the estimates of the β i s have very small (around zero) BIAS for all sample sizes. Moreover, as n increases the BIAS of σ 2 and the IABIAS of the estimated ψ(z) tend to zero. The plots in Figure 1 also reveal that the MSE of the parameters and MISE of the estimated ψ(z) tend to zero when the sample size is increased. These results indicate that our estimator of the regression parameters and the estimation of ψ(z) is rather accurate. In this Monte-Carlo simulation experiment, the flexibility of the proposed PLR-SMN-IC model to modeling data generated form the some other distributions, is investigated. We generate 200 samples of size 400 from the rightcensored linear regression model, where β ⊤ = (β 0 , β 1 , β 2 , β 3 ) = c(1, 2, −2, 1), σ 2 = 2, x 1 ∼ N (0, 1), x 2 ∼ B(1, 0.5), x 3 ∼ U(−2, 2) and z ∼ U (−1, 7) . The considered levels of censoring are 7.5%, 15%, and 30%. We generate the error terms under the three different scenarios from the SMN representation (2) . In the first scenario the random variable U in representation (2) is generated from the exponential distribution with mean 2, where as the Birnbaum-Saunders (BS) and generalized inverse Gaussian (GIG) models are used for the second and third scenarios, respectively. The parameters are set to β = 1 (scale parameter) and ν = 2 (shape parameter) for the BS distribution, and to (κ, χ, ψ) = (−0.5, 1, 3) for the GIG model. Since the error term under these scenarios follows the Laplace, BS and GIG scale-mixture distributions, the notations PLR-L-IC, PLR-SBS-IC and PLR-SGIG-IC are used to denote the PLR model under these models. Bare in mind that the considered non-normal data generation offers the desired level of leptokurtosis. We also note that the PLR-L-IC, PLR-SBS-IC and PLR-SGIG-IC models are not considered in this paper since their conditional expectations involved in the ECME algorithm are not exist. For each sample, we fit the PLR-N-IC, PLR-T-IC, PLR-SL-IC and PLR-CN-IC models to the data by assuming both m 1 and m 2 methods for choosing the number of knots, as well as ES and ESQ methods for the knots position. Table 2 depicts the average values of AIC and BIC measures over the 200 trials. Not surprisingly, under different levels of censoring, number of knots and position of the knots, all criteria favour PLR censored models based on heavy tail distributions. As highlighted in Table 2 , the PLR-T-IC model is the best in almost all cases. Furthermore, the outputs in this table suggest that m 2 preforms significantly better than m 1 . However, it can be seen that there is no large difference in the AIC and BIC values between the two methods of knot position for all models, indicating their robustness against the position of knots. In this section, we consider the following left-censored PLR model with censoring levels 10%, 20%, where the nonparametric component ψ(z i ) has the form 3 sin(2z i ) + 10ξI (0,0.1) (z i ) + ξI (0.1,∞) (z i ), in which ξ is set to vary among 0, 3 and 6. Figure 2 shows the plot of ψ(z) in which the jump in the function for different values of ξ can be observed. We assume that the realizations (x 1i , z i ) are jointly generated from a bivariate normal distribution with mean zero, variance one and correlation coefficient ρ = 0.5. The error term ǫ i is also drawn from a standard normal model. In this experiment, we are interested in predicting the censored observations, y c i , through computing the expectationŷ c i = E(Y |w i , ρ i ,Θ) at the last iteration of the ECME algorithm. Moreover, to check the influence of noise points in model performance and imputation of censored observations, some noisy points simulated from U(−5, 5), U(−3, 2), and U(−2, 8) are added, respectively to y and x 1 and z. Table 3 indicate that the value of MAE increases as the number of censored observations is increased. As can be expected, the PLR-N-IC model preforms well when the number of noise realizations is zero. However, adding the noise points reduces its flexibility in both fitting data and predicting censored observations. It can be observed that only the MAE of the PLR-N-IC model increases as the noise points are added to the datasets, showing its lack of robustness in dealing with the contaminated data with In this Monte-Carlo experiment, we are interested in comparing the performance of the parameter estimates in the presence of outliers on the response variable. We simulate 500 Monte Carlo samples from the interval-censored PLR model (5) of size n = 300, while the errors are randomly generated from the normal distribution with scale parameter σ 2 = 2. The imposed censoring levels are 10%, 20% and 30%. We set β ⊤ = (1, 4, 2) and generate the covariates To assess relative changes on the parameter estimates by the presence of outliers, the mean magnitude of relative error (MMRE; [21] ), defined is calculated as follows: Curves of the average MMRE as a function of contamination level δ are displayed in Figure 3 . It could be seen that the influence of the outliers in parameter estimation increases as δ increases for all models. As one would expect, the heavy-tailed models, such as PLR-T-IC and PLR-SL-IC, are less adversely affected, showing their robustness against the presence of outliers. On the other hand, an extreme observation seems to be much more effective on the PLR-N-IC, reflecting a lack of ability to reduce the influence of outliers. In the following, we present two applications of the PLR-SMN-IC model to real data for illustrative purposes. The first real dataset corresponds to the young married women's labor force participation using the data extracted from the Canadian Survey of Labour and Income Dynamics (SLID). The complete data, reported in [22] , consists of 6900 respondents of the married women aging from 18 and 65 years. However, the samples with missing data (near to 7%) are removed from the study. For each 6340 women in the remaining sample, four measures, namely working hours, family income, age and education, are recorded. By way of illustration, we consider the PLR model as where y is the working hours scaled by 1000, x 1 age, x 2 family income scaled by 1000 and z as education. We note that among 6340 women in the sample, only 4398 represent propensity to work outside the home, i.e. if that propensity Table 4: Model performance and evaluation of the prediction accuracy for the PLR-N-IC, PLR-T-IC, PLR-CN-IC and PLR-SL-IC models. PLR-N-IC PLR-T-IC PLR-CN-IC PLR-SL-IC Data Criterion is above the threshold 0, we observe positive hours worked. Therefore, the response variable is left-censored at the value 0. By way of the second illustration, we consider extramarital Affairs data that is available on the "AER" packages of R. The Affairs dataset, originally reported in [23] , is recently re-analyzed by [24] in the censored regression framework. The variables involved in the study of the Affairs dataset were: the number of observations engaged in extramarital sexual intercourse during the past year (y), number of years married (x 1 ), occupation according to Hollingshead classification (x 2 ), self rating of marriage (x 3 ) and the age as the nonparametric component z. Recommended by [24] , the response variable is highly left-censored at zero (451 out of 601, or 75%) which may have significant effects on the coefficients. We fit the PLR-N-IC, PLR-T-IC, PLR-CN-IC and PLR-SL-IC models to each of the datasets by implementing the proposed ECME algorithm in Section 4. We use both m 1 and m 2 approaches for choosing the number of interior knots. Exploiting the ES and ESQ methods for the location of the knots suggest that the ESQ has better performance for analyzing these datasets. Table 4 presents the maximized log-likelihood function, and the values of AIC and BIC for all fitted models with both m 1 and m 2 number of interior knots. It can be observed that the PLR-SMN-IC models with heavy tails have better fit than the PLR-N-IC model. Results based on AIC and BIC indicate that the PLR-SL-IC and PLR-T-IC models provide a highly improved fit of the data over other models, for the SLID and Affairs datasets respectively. Table 5 shows the parameter estimates of the best fitted PLR-SMN-IC sub-models with respect to the number of knots, for the SLID and Affairs datasets. It can be seen that all models offer similar estimates for the slope parameters (β 1 , β 2 and β 3 ). Moreover, the parameter estimate ν is significantly different from zero and has large values for the PLR-T-IC, PLR-CN-IC, and PLR-SL-IC models, indicating the departure of the data from the normality assumption. Finally, we display in Figure 4 the estimated curve of the nonparametric component ψ(z) for the fitted PLR-SL-IC and PLR-T-IC models to the SLID and Affairs datasets, respectively. This paper proposed a semiparametric inference for partially linear regression model with interval-censored responses where the SMN family of distributions is assumed for the error term. The new model provided an alternative benchmark for the conventional choice of the normal distribution. Our proposed model extended the recent works by [12, 13] to the partially linear regression framework that, to the best of our knowledge, there are no previous studies of a likelihood-based perspective related to this topic. Using the basis spline function, B-spline, we defined the pseudo covariates and pseudo regression parameter. We then exploited the hierarchical representation of the SMN class of distributions for developing a feasible ECME algorithm to obtain the ML parameter estimates. We conducted four simulation studies to examine the performance of the proposed model and its parameter estimation. Specifically, simulation studies aimed at checking the ability of the new methodology in parameter recovery, model comparisons and sensitivity analysis in presence of noise points and a single outlier. Finally, two real-world datasets, SLID and Affairs, were analyzed for illustrative purposes. As was reported, the heavy-tailed PLR-SMN-IC models, such as PLR-T-IC, PLR-CN-IC and PLR-SL-IC, presented better results than the PLR-N-IC model to these real data examples. All computations were carried out by R language and the computer program is available from the first author upon request. The methodology presented in this paper can be extended through the following open issues: • Motivated from the SLID dataset, it is interesting to formulate a PLR model to simultaneously handle missing and censored observations [25, 26] . • Motivated from the Affairs dataset and recommended by [24] , one can construct a PLR-SMN model for analyzing doubly-censored data. • Two important questions that might be raised are: how can we handle multivariate covariates in the nonparametric part? and how can we select the vector of explanatory covariates for both linear and nonparametric parts?. The former issue can be addressed through either the single-index model, generalized additive model or multivariate B-spline function, whereas the latter can be answered by variable selection studies. • As future research, we are exploring in building a finite mixture of semiparametric partially linear regression models as an extension of [27, 28] in both likelihood and Bayesian context [29, 30] . • If Y ∼ T µ, σ 2 , ν , we haveû =ν + 1 ν + δ y,μ,σ , where δ(y, µ, σ) = (y − µ)/σ 2 . • If Y ∼ SL µ, σ 2 , ν , thenû = 2 δ y,μ,σ −1 Γ ν + 1.5, 0.5δ y,μ,σ Γ ν + 0.5, 0.5δ y,μ,σ . • If Y ∼ CN µ, σ 2 , ν, γ , we havê u = 1 −ν +νγ 1.5 exp 0.5(1 −γ)δ y,μ,σ 1 −ν +νγ 0.5 exp 0.5(1 −γ)δ y,μ,σ . Censored cases: In the censored cases, we have ρ = 1. For the sake of notation, let Semiparametric estimates of the relation between weather and electricity sales Semi-parametric accelerated failure time regression analysis with application to interval-censored HIV/AIDS data Penalized spline estimation in the partially linear model Efficient estimation in the partially linear quantile regression model for longitudinal data Parametric and semiparametric estimation methods for survival data under a flexible class of models Multivariate t semiparametric mixed-effects model for longitudinal data with multiple characteristics Root-n-consistent semiparametric regression On estimating regression Smooth regression analysis Quasi-likelihood estimation in semiparametric models Estimation in nonlinear mixed-effects models using heavy-tailed distributions Linear censored regression models with scale mixtures of normal distributions Nonlinear censored regression models with heavy-tailed distributions The ecme algorithm: a simple extension of em and ecm with faster monotone convergence On polya frequency functions iv: The fundamental spline functions and their limits Spline functions: basic theory Polynomial spline estimation and inference of proportional hazards regression models with flexible relative risk form Maximum likelihood estimation via the ecm algorithm: A general framework Tutorial on methods for interval-censored data and their implementation in r Censored mixed-effects models for irregularly observed repeated measures with applications to HIV viral loads Robust regression with application to symbolic interval data Applied regression analysis and generalized linear models A theory of extramarital affairs Econometric analysis. Pearson Education India Multivariate longitudinal data analysis with censored and intermittent missing responses Multivariate-t linear mixed models with censored responses, intermittent missing values and heavy tails Finite mixture of regression models for censored data based on scale mixtures of normal distributions Mixture of linear experts model for censored data: A novel approach with scale-mixture of normal distributions Bayesian analysis of mixture modelling using the multivariate t distribution Robust statistical modelling using the multivariate skew t distribution with complete and incomplete data The authors declare that they have no conflict of interest.Appendix A: Conditional expectations of the special cases of the SMN distributions Uncensored observations: For the uncensored data y, we have ρ = 0. Therefore, the only necessary conditional expectationû = E(U |Y = y,Θ) for the considered models can be computed as follows.• If Y ∼ N µ, σ 2 , in this case, U = 1 with probability one, and soû = 1.