key: cord-0154754-juhl27n9 authors: Badmus, N. I.; Faweya, O.; Ige, S. A. title: Parametric Modeling Approach to COVID-19 Pandemic Data date: 2021-09-13 journal: nan DOI: nan sha: a15ec074c46d530aa7ef26c0b31eaa24300f457f doc_id: 154754 cord_uid: juhl27n9

Skewness is common in clinical trial and survival data, and it has motivated the derivation and proposal of many flexible distributions. We therefore construct a new distribution, the Extended Rayleigh Lomax (ERL) distribution, from the Rayleigh Lomax distribution to capture the pronounced skewness of some survival data. We derive the new distribution using the beta logit function proposed by Jones (2004). We obtain several statistical properties of the distribution: the probability density function, cumulative distribution function, reliability function, hazard rate, reverse hazard rate, moment generating function, likelihood function, skewness, kurtosis and coefficient of variation. We estimate the model parameters by maximum likelihood, and we use goodness-of-fit and model selection criteria, namely Anderson-Darling (AD), Cramér-von Mises (CVM), Kolmogorov-Smirnov (KS), the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Consistent Akaike Information Criterion (CAIC), to select the best distribution among the models considered. On these criteria, the proposed distribution represents the state-level COVID-19 death data for Nigeria better than the competing models.

In survival analysis, clinical data are often difficult to analyse because the distributions proposed are not flexible enough to follow the movement of the data and give accurate results. There is therefore a need to develop a more flexible parametric model, illustrated here with COVID-19 data.
Recently, a third wave of the COVID-19 pandemic, driven by the Delta variant, followed the second wave and generated a global outcry. Since the outbreak of the pandemic in December 2019, researchers from fields such as medicine, statistics and economics have studied it with different ideas, models, methods and approaches. These include Badmus et al. (2020), Dey et al. (2020), WHO (2020) and Yoo (2020), amongst others. Most clinical data are skewed; we therefore construct a new distribution from a parent distribution, the Rayleigh Lomax (RL) distribution of Kawsar et al. (2018), using the beta link function introduced by Jones (2004). The new distribution is expected to offer different shapes for the survival and hazard rate functions. Adding parameters to the parent distribution increases its flexibility and its ability to model real-life data.

Several methods for generating new distributions exist in the literature. In this study we use the beta logit function introduced by Jones (2004), which can combine two or more distributions. The probability density function (pdf) of the distribution is obtained using the beta link function given as:

When the additional parameters in (2) are all set equal to 1, it reduces to the Rayleigh Lomax distribution, which is the parent distribution (see Siddiqui, 1962). The cumulative distribution function (cdf) associated with (2) is given as:

We set:

Substituting dx into equation (2), we obtain:

and k in equation (4) becomes:

Equation (5) can be expressed as:

Expression (7) is the cumulative distribution function of the ERL distribution.
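The beta-generated construction described above can be sketched in code. This is a minimal illustration, not the authors' implementation: it assumes the standard beta-generated form f(x) = g(x) G(x)^(a-1) [1 - G(x)]^(b-1) / B(a, b) with cdf F(x) = I_{G(x)}(a, b), and it uses a plain Rayleigh distribution as a stand-in baseline because the Rayleigh Lomax expressions are not reproduced in this text. All function names are hypothetical.

```python
import math


def beta_fn(a, b):
    """Complete beta function B(a, b), computed from gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def rayleigh_cdf(x, sigma=1.0):
    """Stand-in baseline cdf G(x): plain Rayleigh, NOT the Rayleigh Lomax parent."""
    return 1.0 - math.exp(-x * x / (2.0 * sigma ** 2))


def rayleigh_pdf(x, sigma=1.0):
    """Stand-in baseline pdf g(x) matching rayleigh_cdf."""
    return x / sigma ** 2 * math.exp(-x * x / (2.0 * sigma ** 2))


def beta_generated_pdf(x, a, b, g=rayleigh_pdf, G=rayleigh_cdf):
    """f(x) = g(x) G(x)^(a-1) [1 - G(x)]^(b-1) / B(a, b)."""
    u = G(x)
    return g(x) * u ** (a - 1.0) * (1.0 - u) ** (b - 1.0) / beta_fn(a, b)


def beta_generated_cdf(x, a, b, G=rayleigh_cdf, steps=20000):
    """F(x) = I_{G(x)}(a, b), the regularized incomplete beta function at G(x).

    The integral of t^(a-1) (1-t)^(b-1) over [0, G(x)] is approximated with
    the midpoint rule, which never evaluates the interval endpoints.
    """
    u = G(x)
    if u == 0.0:
        return 0.0
    h = u / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    return total * h / beta_fn(a, b)


def survival(x, a, b):
    """Reliability function R(x) = 1 - F(x)."""
    return 1.0 - beta_generated_cdf(x, a, b)


def hazard(x, a, b):
    """Hazard rate h(x) = f(x) / R(x)."""
    return beta_generated_pdf(x, a, b) / survival(x, a, b)
```

With a = b = 1 both functions return the baseline pdf and cdf unchanged, which mirrors the statement that the parent distribution is recovered when the added parameters equal 1. The midpoint rule is used only to keep the sketch dependency-free; a library routine for the regularized incomplete beta function would be the usual choice, and shape values below 1 would need more careful numerical treatment.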
The reliability function of the ERL distribution is given by:

To verify that the ERL density is a proper probability density function, we use the generalized beta distribution of the first kind of Jones (2004), given by:

where the parameters are greater than 0. Differentiating the expression above, we obtain:

Hence, the distribution has a valid continuous probability density function.

In this section, we derive the moment generating function (mgf) of the distribution, $M_X(t) = E(e^{tX})$, and the general $r$-th moment of a beta generated distribution defined by Hosking (1990). Cordeiro et al. (2011) also discussed another mgf for the beta generated distribution:

where,

Substituting the pdf and cdf of the Extended Rayleigh Lomax distribution into equation (14), we get:

If $a = b = 1$ in equation (14), it becomes the moment generating function of the baseline distribution. Hence the $r$-th moment of the ERL distribution is obtained, since the moment generating function of the parent distribution is given by:

Equation (16) can be rewritten as:

The $r$-th moment of the ERL distribution is written as:

where,

The first four moments $\mu'_r$, $r = 1, 2, 3, 4$, are obtained through (17) as:

Furthermore, the mean and the second to fourth central moments of the ERL distribution are given as follows: $\mu = \mu'_1$, $\mu_2 = \mu'_2 - \mu^2$, $\mu_3 = \mu'_3 - 3\mu\mu'_2 + 2\mu^3$, and $\mu_4 = \mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4$.

Other measures, namely the skewness, kurtosis and coefficient of variation (CV) of the ERL distribution, are given below. The skewness measures the asymmetry of the distribution and is given by:

The kurtosis measures the peakedness of the distribution. The kurtosis of the ERL distribution is given as:

The coefficient of variation measures the variability of a probability distribution. The CV of the ERL distribution is given as:

We attempt to derive the maximum likelihood estimates (MLEs) of the ERL distribution parameters a, b, θ, λ and β, which are scale and shape parameters.
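The moment-based measures described above can be illustrated with a short sketch. It converts the first four raw moments into central moments using the relations stated in the text, then computes skewness, kurtosis and CV with the standard moment-ratio definitions; those three definitions are an assumption here, since the paper's own expressions are not recoverable from this text, and the function names are hypothetical.

```python
import math


def central_moments(m1, m2, m3, m4):
    """Convert raw moments mu'_1..mu'_4 into (mean, mu_2, mu_3, mu_4)."""
    mu = m1
    mu2 = m2 - mu ** 2
    mu3 = m3 - 3.0 * mu * m2 + 2.0 * mu ** 3
    mu4 = m4 - 4.0 * mu * m3 + 6.0 * mu ** 2 * m2 - 3.0 * mu ** 4
    return mu, mu2, mu3, mu4


def shape_measures(m1, m2, m3, m4):
    """Return (skewness, kurtosis, CV) from the first four raw moments.

    Assumed standard definitions:
      skewness = mu_3 / mu_2^(3/2)
      kurtosis = mu_4 / mu_2^2
      CV       = sqrt(mu_2) / mu
    """
    mu, mu2, mu3, mu4 = central_moments(m1, m2, m3, m4)
    skewness = mu3 / mu2 ** 1.5
    kurtosis = mu4 / mu2 ** 2
    cv = math.sqrt(mu2) / mu
    return skewness, kurtosis, cv


# Check against a case with known answers: the unit exponential distribution
# has raw moments r! = 1, 2, 6, 24, skewness 2, kurtosis 9 and CV 1.
skewness, kurtosis, cv = shape_measures(1.0, 2.0, 6.0, 24.0)
print(skewness, kurtosis, cv)
```

For the ERL distribution the four raw moments would come from the moment expression in (17), evaluated at the fitted parameter values.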
Following Cordeiro, the log-likelihood function of the ERL distribution is given as:

Differentiating with respect to a, b, θ, λ and β gives the following:

The data used for the analysis are secondary data obtained from the NCDC COVID-19 Situation Weekly Epidemiological Report 39 (5th-11th July 2021). The report covers the thirty-six (36) states and the Federal Capital Territory (FCT), with reported laboratory-confirmed COVID-19 cases, recoveries, deaths, samples tested and active cases (37 data points), and was accessed on Thursday 22nd July 2021. Only the death cases from all states of the federation are used in the analysis.

The summary statistics and goodness-of-fit tests used to check the normality of the data, namely skewness, kurtosis, Anderson-Darling (AD), Kolmogorov-Smirnov (KS) and Cramér-von Mises (CVM), are shown in Table 1. Their values clearly indicate that the data do not follow a normal distribution: the p-values are below 5%, the skewness is greater than 0 and the kurtosis is greater than 3 (Karadimitriou and Shivam Mishra, 2020). The graphs in Figure 2 show the nature of the data: the scatter plot, theoretical quantile plot, boxplot, histogram, density plot and empirical cumulative distribution function (ecdf) plot all show that the data are skewed. For instance, the scatter and quantile plots show non-linearity, the boxplot shows outliers, and the histogram and density plots show skewness. The minimum and maximum values in the data set are inclusive.

The results in Table 2 are based on parameter estimates obtained by maximum likelihood estimation (MLE). Standard errors are given in brackets for all models. The ERLD is compared with six other models: ExpLD, LRLD, BRD, RLD, ExpRLD and BLD. Model selection criteria are also computed for all models considered in the study. From the results, the ERLD has the smallest value on every criterion (shown in bold and starred)
, which indicates that it is a robust and flexible model. Despite the skewness of the Nigerian COVID-19 death cases data set, the ERL distribution follows the movement of the data and represents it better than any of the other existing distributions considered. Being flexible and versatile, the proposed distribution can accommodate increasing, decreasing, bathtub and unimodal hazard functions. It is therefore useful and effective in the analysis of clinical and survival data.

Modeling COVID-19 Pandemic Data with Beta Double Exponential Distribution
A new family of generalized distributions
Exponential Lomax Distribution
L-moments: Analysis and Estimation of Distributions using Linear Combinations of Order Statistics
Families of distributions arising from distributions of order statistics. Test
Statistical Properties of Rayleigh Lomax distribution with applications in Survival Analysis
States with Reported Laboratory Confirmed COVID-19 Cases, Recoveries, Deaths, Samples Tested and Active Cases
Normality Test with Python in Data Science
Some problems connected with Rayleigh distributions
Analyzing the epidemiological outbreak of COVID-19: A visual exploratory data analysis approach
Surveillance case definitions for human infection with novel coronavirus (nCoV)
Yoo. The fight against the nCoV outbreak: An arduous march has just begun