key: cord-0257338-c18d69kh
authors: Paul, S.; Lorin, E.
title: Estimation of COVID-19 recovery and decease periods in Canada using machine learning algorithms
date: 2021-07-22
journal: nan
DOI: 10.1101/2021.07.16.21260675
sha: 5f473e31c13cf3089b8d19b9693223ad43750e07
doc_id: 257338
cord_uid: c18d69kh

We derive a novel model escorted by large scale compartments, based on a set of coupled delay differential equations with extensive delays, in order to estimate the incubation, recovery and decease periods of COVID-19, and more generally any infectious disease. This is possible thanks to machine learning algorithms applied to publicly available database of confirmed corona cases, recovered cases and death toll. In this purpose, we separate i) the total cases into 14 groups corresponding to 14 incubation periods, ii) the recovered cases into 406 groups corresponding to a combination of incubation and recovery periods, and iii) the death toll into 406 groups corresponding to a combination of incubation and decease periods. In this paper, we focus on recovery and decease periods and their correlation with the incubation period. The estimated mean recovery period we obtain is 22.14 days (95% Confidence Interval(CI): 22.00 to 22.27), and the 90th percentile is 28.91 days (95% CI: 28.71 to 29.13), which is in agreement with statistical supported studies. The bimodal gamma distribution reveals that there are two groups of recovered individuals with a short recovery period, mean 21.02 days (95% CI: 20.92 to 21.12), and a long recovery period, mean 38.88 days (95% CI 38.61 to 39.15). Our study shows that the characteristic of the decease period and the recovery period are alike. From the bivariate analysis, we observe a high probability domain for recovered individuals with respect to incubation and recovery periods. A similar domain is obtained for deaths analyzing bivariate distribution of incubation and decease periods.

The outbreak of coronavirus disease 2019 (COVID- 19) , reported early in Wuhan (China) 1 and spread around the world, is creating dramatic and daily changes with profound impacts worldwide. As a consequence the outbreak was declared a pandemic by the World Health Organization (WHO) in March 2020 2 , and by the end of 2020, COVID-19 has infected about 79.2 millions of people in the world, with an approximate cumulative global mortality of 3.2% 2 . To limit the impact of this deadly virus, a rapid and widespread vaccination of the population is now in place. However, it is established that vaccine are not 100% effective to stop the transmission or infection of COVID- 19 In this circumstance to get a complete feature of COVID-19, it is essential to fully understand the key (incubation, recovery and decease) periods.

We already successfully estimated the incubation period of COVID-19 in Canada 3 . In the present context, we focus on the recovery and decease periods and their correlation with the incubation period. In the current framework, we define the recovery period as the time from the contraction of the coronavirus to recovery, i.e., the incubation period plus the onset time from the symptom to recovery; the latter is the same as the viral shedding of SARS-CoV-2. We describe the decease period in the same way as the recovery period. Understanding the recovery period of disease is very useful information in the struggle against the disease. If the incidence of a disease is remarkably high and the recovery period of the disease is also high then the prevalence of the disease in the country is likely to increase which in turn puts extra health, economic and social burden on this country. Understanding the recovery period of the disease will help governments to plan proper strategies to counter the disease and to organize the requirements such as hospitals, doctors, medical staffs, medical equipment's, etc. It will also help to implement different social and economic policies which will be essential to fight the disease.

There are several statistical studies [4] [5] [6] [7] [8] [9] [10] [11] , based on various samples of patients such as severe, non-severe, ICU, non-ICU, large size, small size, meta-analysis, estimated the recovery time of the current pandemic. In addition to those statistical approaches, there are numerous analytical and computational studies based on mathematical models, involving Ordinary 

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted July 22, 2021. ; . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted July 22, 2021. ; Figure 2 . Distribution of the recovery and decease periods: Results based on the total recovered cases of the first 177 days during the pandemic in Canada starting from January 22, 2020 i.e., cumulative data as of July 16,2020. (a) Splitting values of recovered individuals as a function of incubation and recovery periods. (b) Probability density function of the gamma distribution Γ(ζ , K, θ ) with K = 18.62067 and θ = 1.18892. The blue bars indicate the densities obtained from the model calculation. (c) Probability density function of the bimodal gamma distribution 0.9365 Γ(ζ , K 1 , θ 1 ) + 0.0635 Γ(ζ , K 2 , θ 2 ) with K 1 = 34.55447, θ 1 = 0.60847, K 2 = 226.40545 and θ 2 = 0.17171. The blue bars indicate the densities obtained from the model calculation. (d) Percentile curves for unimodal and bimodal gamma distributions. (e) Splitting values of the deaths as a function of incubation and decease periods. (f) Probability density function of the gamma distribution Γ(η, K, θ ) with K = 21.33660 and θ = 1.03174. The blue bars indicate the densities obtained from the model calculation. (g) Probability density function of the bimodal gamma distribution 0.9508 Γ(η, K 1 , θ 1 ) + 0.0492 Γ(η, K 2 , θ 2 ) with K 1 = 35.00855, θ 1 = 0.60511, K 2 = 186.11379 and θ 2 = 0.20636. The blue bars indicate the densities obtained from the model calculation.

(h) Percentile curves for unimodal and bimodal gamma distributions.

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted July 22, 2021. ; Figure 3 . Bivariate distribution of the incubation and recovery periods: (a) Histogram of the estimated data R ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 using the model. (b) Fitted bivariate normal distribution. (c) Two-dimensional display of (b); the red region is the highly probable domain for recovery, and x (6.43, 21.91) denotes the center of the region. (d) Fitted nonparametric density estimate with wide 33%; two peaks show that there are two distinguishable high probable regions. (e) Two-dimensional display of (d); two x represent the centers of two high probable regions (3.49, 20.52) and (8.38, 20.35 ).

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . Bivariate distribution of the incubation and decease periods: (a) Histogram of the estimated data D ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 using the model. (b) Fitted bivariate normal distribution. (c) Two-dimensional display of (b); the red region is the highly probable domain for decease, and x (6.56, 21.64) denotes the center of the region. (d) Fitted nonparametric density estimate with wide 40%; two peaks show that there are two distinguishable high probable regions. (e) Two-dimensional display of (d); the x represent the centers of the high probable red region (8.17, 21.86).

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted July 22, 2021. distributions, unimodal ( Fig.2 (f)) Γ(η, K d , θ d ) and bimodal ( Fig.2(g) ) 0.9508 Γ(η, K d1 , θ d1 ) + 0.0492 Γ(η, K d2 , θ d2 ). Here, the variable η indicates the decease period and the parameters K d = 21.33660, θ d = 1.03174, K d1 = 35.00855, θ d1 = 0.60511, K d2 = 186.11379 and θ d2 = 0.20636 with statistical p value less than 0.01 for Γ(η, K d , θ d ), Γ(η, K d1 , θ d1 ) and equal to 0.18 for Γ(η, K d2 , θ d2 ) . The mean decease period we obtain using an unimodal gamma distribution is 22 31.10) . For better estimation, we use a bimodal distribution, a linear combination of Γ(η, K d1 , θ d1 ) and Γ(η, K d2 , θ d2 ). The mean of Γ(η, K d1 , θ d1 ) and Γ(η, K d2 , θ d2 ) are 21.18 days (95% CI 20.90 to 21.47) and 38.41 days (95% CI 37.41 to 39.40), respectively. The percentile curves show ( Fig.2(h) ) that the percentiles of unimodal and bimodal distributions are almost the same.

To analyze the bivariate distribution, we use the software Statgraphics 29 , based on the statistical package R. Using the elements R ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29, we obtain a bivariate histogram ( Fig.3(a) ) for the incubation and recovery periods. There are two peaks at the points (3, 19) , i.e., for τ i = 3 and ζ k = 19, and (8, 20) , i.e., for τ i = 8 and ζ k = 20, corresponding to the high densities of recovered individuals. We estimate the histogram using a bivariate normal distribution Fig.3(b) ) where the variables τ and ζ represent the incubation and recovery periods, respectively. The mean m (r) τ and standard deviation σ (r) τ of the incubation period are 6.43 (95% CI 6.27 to 6.59) and 3.06 (95% CI 2.96 to 3.18), respectively; the mean m ζ and standard deviation σ ζ of the recovery period are 21.91 (95% CI 21.63 to 22.18) and 5.33 (95% CI 5.14 to 5.53), respectively; the correlation between incubation and recovery periods ρ τζ is -0.11. The two dimensional representation of the bivariate normal distribution (Fig.3 (c))shows that the highly probable recovery region (red in the figure) is a nested domain of τ = 6.43 and ζ = 21.91. To precisely analyze the highly probable region, we estimate the histogram ( Fig.3 (a)) using a nonparametric density function with a width of 33%, low and high percentage give a more local and global estimation, respectively, and we obtain a distribution with two peaks (Fig.3(d) ). Two distinguishable peaks indicate that there are two separate highly probable regions surrounding the points τ = 3.49, ζ = 20.52 and τ = 8.38, ζ = 20.35 (Fig.3(e) ). The bivariate mixture distribution analysis shows that we can estimate the histogram of R ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 using a combination of two bivariate normal distributions, 0.94N (m

where the superscript 1 (resp. 2) represents the parameters for the first (resp. second) component, respectively. The parameters of the first component are m 

Using the elements D ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29, we obtain a bivariate histogram ( Fig.4(a) ) for the incubation and decease periods. There are two peaks at the points (3, 22) , i.e., for τ i = 3 and η k = 22, and (9, 23), i.e., for τ i = 9 and η k = 23, corresponding to the high densities of deaths. We estimate the histogram using a bivariate normal distribution Fig.4(b) ) where the variables τ and η represent the incubation and decease periods, respectively. The mean m (d) τ and standard deviation σ (d) τ of the incubation period are 6.56 (95% CI 6.36 to 6.76) and 3.00 (95% CI 2.86 to 3.15), respectively; the mean m η and standard deviation σ η of the decease period are 21.64 (95% CI 21.33 to 21.94) and 4.43 (95% CI 4.23 to 4.65), respectively; the correlation between incubation and decease periods ρ τη is -0.008. The two dimensional representation of the bivariate normal distribution ( Fig.4 (c)) shows that the highly probable decease region (red in the figure) is a nested domain of τ = 6.56 and η = 21.64. To precisely analyze the highly probable regions, we estimate the histogram ( Fig.4(a) ) using a nonparametric density function with a width of 40%, low and high percentage give a more local and global estimation, respectively, and obtain a distribution with two peaks (Fig.4(d) ), one in the high probability region (red in figure) and another one in the second high probability region (yellow in figure) . The highly probable region is surrounding the point τ = 8.17, η = 21.86 ( Fig.4(e) ). The bivariate mixture distribution analysis shows that we can estimate the histogram of D ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 using a combination of two bivariate normal

τη ) where the superscript 1 (resp. 2) represents the parameters for first (resp. second) component. The parameters of the first component are m

τη = 0.03, and those of the second component are m

Onset time from symptom to recovery Using the fact that τ + θ = ζ and the property of expectation E(T + Θ) = E(T ) + E(Θ), we calculate the mean Onset Time from Symptom to Recovery (OTSR) E(Θ) (Table1), where θ is the variable corresponding to θ ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29; T and Θ are the random variables corresponding the incubation period and OTSR, respectively. There is a good agreement between the calculated values, mean of OTSR, short OTSR and long OTSR, with the reported works (Table1)

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted July 22, 2021. ;  of earlier studies. However, these calculated values do not show excellent concordance with some other studies, because we consider all recovery cases, mild to moderate, severe, hospitalized (ICU, non-ICU), non hospitalized, in Canada. For example, Voinsky et. al. 4 reported a study with a sample of 5769 patients, not including severe COVID-19 cases. In fact, they mentioned that severe cases were reported to be discharged from the hospital on average 8 days longer than mild to moderate patients requiring hospitalization.

In the present context, we estimate the recovery as well as decease periods using a novel compartment based model and publicly available database. Here, we consider a maximum length of the incubation period of 14 days, and the ranges of the recovery and decease periods are from 2 to 6 weeks. However, in our method, we can go well beyond all those ranges; the longer ranges simply require a long computational time. Notice that our method could apply the proposed model to estimate key periods for any infectious disease, as along as similar data are available.

The multi-group database R ik , i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29, generated from the model, is the key source to compute all types of distribution of the recovery period, univariate, bimodal and bivariate. The bimodal gamma distribution of the recovery period, 0.9365 Γ(ζ , K r1 , θ r1 ) + 0.0635 Γ(ζ , K r2 , θ r2 ), demonstrates that the recovery period of 93.65% recovered individuals obeys the distribution Γ(ζ , K r1 , θ r1 ), and that of 6.35% recovered individuals obeys the distribution Γ(ζ , K r2 , θ r2 ). Thus, there are two groups of recovered individuals with short recovery period, 21.02 days (on average), and long recovery period, 38.88 days (on average). The characteristics of those two groups may depend on age, underlying health condition, immunity, etc. The database of numerous groups D ik , i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29, generated from the model, is the key source to compute all types of distribution of the decease period, univariate, bimodal and bivariate. The bimodal gamma distribution of the decease period, 0.9508 Γ(ζ , K d1 , θ d1 ) + 0.0492 Γ(ζ , K d2 , θ d2 ), demonstrates that the decease period of 95.08% deaths obeys the distribution Γ(ζ , K d1 , θ d1 ), and that of 4.92% deaths obeys the distribution Γ(ζ , K d2 , θ d2 ). Thus, there are two groups of deaths with short decease period, 21.18 days (on average), and long decease period, 38.41 days (on average). The characteristics of those two groups may depend on age, underlying health condition, immunity, etc. The calculated results employing the proposed model show that the recovery and decease periods are the same. It seems that the survival period of the coronavirus is the same as that of human, in the form of immunity.

The bivariate normal distribution of incubation and recovery periods indicates a recovery window of 4.82 ≤ τ ≤ 8.49 and 19.27 ≤ ζ ≤ 25.72 as the highly probable domain for recovery. The bivariate normal distribution of incubation and decease periods indicates a decease window of 4.55 ≤ τ ≤ 8.45 and 19.35 ≤ η ≤ 24.85 as the highly probable domain of deaths. The study shows that the recovery and decease windows almost coincide within these key periods. To determine precisely the recovery as well as the decease windows, we use nonparametric distributions. Under the nonparametric analysis we identify two recovery windows, 2.27 ≤ τ ≤ 4.38, 18.41 ≤ ζ ≤ 22.79 and 6.42 ≤ τ ≤ 9.63, 17.81 ≤ ζ ≤ 23.93 , and one decease window, 6.34 ≤ τ ≤ 9.41, 20.17 ≤ η ≤ 23.69. Nonparametric analysis provides some discrepancy between the recovery and decease windows.

The bivariate mixed distribution, 0.94N (m (r1)

ζ , ρ

τζ ), of the incubation and recovery periods demonstrates that 94% recovered individuals obey N (m (r1)

τζ ), the bivariate normal distribution, with recovery window m

ζ and 6% recovered individuals obey the bivariate normal distribution N (m

τη ), of the incubation and decease periods demonstrates that 97% deaths obey the bivariate

η and 3% deaths obey the bivariate normal distribution N (m

In summary, we have developed a novel compartment based model to divide the publicly available database of total confirmed cases, recovered cases, and number of deaths into numerous subgroups to obtain univariate and bivariate distributions. The model itself and the procedure to solve it, are the core of this work, and it can be applied to any infectious disease in any region provided that similar data are available. In conclusion, we obtain the distributions of the key periods from the population, considering all types of cases (non-hospitalized, non-ICU, ICU) of recovered individuals and deaths, which is naturally better than any sample-dependent result. In this approach, we do not need any clinical survey; the publicly available data on confirmed cases, recovery and death toll, are sufficient to analyze univariate and bivariate distributions. The current model can be extended to study age-based key periods, and for this purpose we need an age dependent database. The monotonic iteration scheme, introduced for better estimation, can be applied to machine learning as well as numerical analysis problems.

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted July 22, 2021. ; https://doi.org/10.1101/2021.07.16.21260675 doi: medRxiv preprint

In this section, we introduce a compartment based infectious disease model including a large number of partitions, Lockdown, Susceptible, Removed, Infected, fourteen compartments of Confirmed cases, hundreds compartments of Recovered and Deaths. The model is constructed as a set of coupled delay differential equations involving few thousands of variables and parameters, and will be used, not as a prediction tool, but i) for constructing the myriad groups of recovered individuals and death tools and ii) estimating accurately the recovery and decease periods. This model will however have to be parameterized and validated using existing data, in order to justify its accuracy and its application in the proposed methodology.

Modeling the spread of epidemic is an essential tool for projecting its outcome. By estimating important epidemiological parameters using the available database and machine learning techniques, we can make predictions of different intervention scenarios. Compartment based model, where the population of a region is distributed into several population groups, such as susceptible, infected, total cases, etc., is a simple but useful tool to demonstrate the panorama of an epidemic.

The proposed model is an extension of our previous work 3 , including a very large number of compartments of recovered and deaths individuals; the schematic diagram of the model is presented in Fig. 5(a) . The following are the underlying principles of the present model.

• The total population is constant (neglecting the migrations, births and unrelated deaths) and initially every individual is assumed susceptible to contract the disease.

• The disease is spread through the direct (face-to-face meeting) or indirect (through air current, common used or delivery items like door handles, grocery products) contact of susceptible individuals with the infected individuals.

• The quarantined area or the compartment for corona cases contains only members of the infected population who are tested corona-positive.

• The virus kills a part of the people it infects; the survivors represent the recovered group.

• There is a non-pharmaceutical policy (stay at home), commonly known as lockdown, to stop the spread of the disease.

• The group of asymptomatic patients is a part of infected individuals, and the never-tested recovered asymptomatic patients can be removed from the infected group. If an asymptomatic patient dies, it is counted after investigation.

Based on the above principles, we consider the following compartments:

• Lockdown (insusceptible) (L): the group of persons who are keeping themselves safe.

• Susceptible (S): the group of individuals who can be infected.

• Infected (I): the group of people who are spreading the contiguous disease.

• Removed (V ): the group of recovered asymptomatic patients without testing.

• Confirmed cases (C): the group of individuals who tested corona-positive.

• Recovered (R): the group of recovered individuals who tested corona-positive.

• Deaths (D): the group of deaths individuals who tested corona-positive.

In the present context, we assume that there is no overlap between these two compartments, infected (I) and confirmed cases (C). In other words, tested corona-positive individuals are assumed to be unable to substantially spread the disease due to isolation and are immune to re-infection after recovery 30 . The aim of the present work is to estimate the distribution of the recovery and decease periods of COVID-19. In this goal, we split the compartment C into J subcomponents C 1 , · · · ,C J , the compartment R into J × M subcomponents R ik for i = 1, · · · , J and k = 1, · · · , M and the compartment D into J × M subcomponents D ik for i = 1, · · · , J and k = 1, · · · , M where

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted July 22, 2021.

In (1), (2) ik represents the total corona-positive cases corresponding the incubation period τ i , recovered individuals corresponding the incubation period τ i and onset time θ ik i.e., recovery period ζ k = τ i + θ ik and death toll corresponding the incubation period τ i and onset time µ ik i.e., decease period η k = τ i + µ ik , respectively, presented in Fig.5(a) .

The time-dependent model is the following set of coupled delay differential equations, for i = 1, · · · , J:

where the real positive modeling parameters α, β , γ δ i , λ ik , κ ik and ν are the rate of lockdown, the rate of infection, the rate of recovery from the asymptomatic group, the rate of tested corona-positive corresponding the incubation period τ i , the rate of recovery corresponding the recovery period ζ k , the rate of decease corresponding the decease period η k and the rate of transit from lockdown compartment to susceptible compartment, respectively. The variables S(t − τ i ) and I(t − τ i ) denote the cumulative data of (t − τ i ) days, i.e., total number of suspected and infected individuals of (t − τ i ) days. The

convey the rate of individuals who were infected τ i days ago, the rate of individuals who were infected τ i + θ ik days ago and recovered, the rate of individuals who were infected τ i + µ ik days ago and died, respectively. It follows from (4), that for any t

where N (constant) is the total population size. We can define a group of new variable T i for i = 1, · · · , J such that

and

where, T , total confirmed cases, is the group of individuals who tested corona positive (active cases + recovered + deaths).

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted July 22, 2021. ; https://doi.org/10.1101/2021.07.16.21260675 doi: medRxiv preprint From Eq.(4) we can generate three different sets of coupled delay differential equations for i = 1, · · · , J and k = 1, · · · , M

and

where Eq.(8), (9) and (10) can be used to calculate incubation period 3 , recovery period and decease period, respectively. In the present context, we focus on recovery as well as decease periods. We solve Eq.(8), (9) and (10) using matlab inner-embedded function dde23 with particular sets of model parameters. To solve the initial value problem, in the interval [t 0 ,t 1 ], we consider L(t 0 ), S(t 0 ), I(t 0 ), T (t 0 ), R(t 0 ) and D(t 0 ) as follows:

where T (t 0 ), R(t 0 ) and D(t 0 ) are the available data at time t 0 , and q is the initial value adjusting parameters. Initially, there are no lockdown individual and no removed individuals from asymptomatic group so that we can consider L(t 0 ) = 0 and V (t 0 ) = 0. It follows from (7) and (11)

In the present contextT (t 0 ) = 0, since there were no corona-positive cases reported on January 22, 2020. As a consequence, we also take T i (t 0 ) = 0 for i = 1, 2, · · · , J, and the similar assumptions are valid for R ik (t 0 ) and D ik (t 0 ) i.e., R ik (t 0 ) = 0 and D ik (t 0 ) = 0 for i = 1, 2, · · · , J and k = 1, 2, · · · , M.

We focus on the exponential growth phase of the COVID-19 epidemic in Canada; one can use this approach to estimate the incubation, recovery period and decease periods for any region affected by this infectious disease. The time resolved (daily

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted July 22, 2021. ; 

, · · · , J and k = 1, 2, · · · , M. We consider J = 14 and M = 29. (b) Bubble diagram of the foundation of the present work, splitting publicly available database, total cases (T), recovered individuals (R) and death toll (D), into myriad groups. (c) Sketch of the Monotonic Iteration Scheme (MIS); for 'recovery' calculation ∆ i = {λ ik |k = 1, 2, · · · , 29} and for 'decease' calculation ∆ i = {κ ik |k = 1, 2, · · · , 29} and i = 1, 2, · · · , 14. (d) Sketch of the optimization scheme for the primary, P 0 , and secondary, P 1 and P 2 , parameters. P 1/2 indicates either P 1 or P 2 . 

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted July 22, 2021. updated) database 27 provides the number of total corona-positive cases, the number of recovered individuals and the death toll. We define two groups of model parameters: primary parameters, the parameters involved in Eq. (8) i.e., q, α, β , γ, δ i for i = 1, 2, · · · , J and ν, and secondary parameters, the parameters involved in Eq.(9) and (10) other than the primary parameters i.e., λ ik and κ ik for i = 1, 2, · · · , J and k = 1, 2, · · · , M. We use the estimated values of the primary parameters to optimize the secondary parameters. The optimal values of the primary parameters P 0 = (q, α, β , δ 1 (t), · · · , δ J (t), ν) T , q is the initial value of I(t), is obtained by minimizing the error function Er(P 0 ), defined as

where T (m) is the available data of total corona-positive cases on the particular mth day, and T (m) is the calculated results obtained from the system (8) . The integer K, used in (13), is the size of the data set. Due to the complexity of the error function, the minimization using the matlab function fminsearch requires a very large number of iterations. We use the similar error functions Er(P 1 ) and Er(P 2 ) to optimize the secondary parameters P 1 = (λ 11 , · · · , λ JM ) T and P 2 = (κ 11 , · · · , κ JM ) T , defined as

and

where P op 0 is the estimated values of P 0 ; R (m) and D (m) are the available data of total number of recovered individuals and total number of death toll; R (m) and D (m) are the calculated results obtained from the Eq.(9) and (10), respectively.

In this section, we present a detailed description of the computational procedure for the proposed model. On 23 January 2020, a 56-year old man admitted to Toronto hospital emergency department in Toronto with a new onset of fever and nonproductive cough, and returning from Wuhan, China, the day prior 31, 32 . It is believed this is the first confirmed case of 2019-nCoV in Canada, and according to the government report, the novel coronavirus arrived on the Canadian coast on January 25, 2020, first reported case. The above information suggests that the start date of the current pandemic in Canada is possibly to be January 22, 2020. Additionally, some research studies reported that the estimation of the incubation period is from 2 days to 14 days, and recovery as well as decease period of COVID-19 is from 2 weeks to 6 weeks 2, 33 . As a consequence, in the present study we consider J = 14 i.e, 14 delays for the incubation period, and M = 29 i.e, 29 delays for the recovery as well as decease periods. Here we consider a calculation of 177 days, from January 22, 2020 to July 16, 2020, duration of the first wave in Canada. The purpose of the model is to separate the publicly available database T , R and D into myriad groups T i , R ik and D ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 ( Fig. 5(b) ).

Then the local minimum computed by the optimization algorithm depends on the initial values of the parameters: for q, α, β , ν we consider any positive random number less than unity, where as a choice of δ = (δ 1 , · · · , δ 14 ) T is tricky. For this purpose, we consider a vector of 14 positive random numbers δ such that δ 1 < · · · < δ 4 < δ 5 > δ 6 > · · · > δ 14 and ∑ 14 i=1 δ i = 0.9. We observe, from numerous numerical experiments, the renormalization factor 0.9 works perfectly for the computation. The estimated values of the primary parameters P op 0 are presented in Fig.5(e) , and the value of the error function Er(P 0 ) = 41.64. The estimated values of the primary parameters are related to Eq.(8), the set of coupled delay differential equations, and Eq.(13), the error function. Using the estimated values of the primary parameters, we optimize the secondary parameters λ ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 related to Eq.(9) and (14) . The choice of the initial values of λ ik is such that for any fixed i, i = 1, 2, · · · , 14, the first fourteen λ ik s i.e., {λ i1 , λ i2 , · · · , λ i14 } are in ascending order, and the rest i.e., {λ i15 , λ i16 , · · · , λ i29 } are in descending order; and ∑ 29 k=1 λ ik = 0.72. After optimization, we obtain the value of the error function Er(P 1 op ) = 236.47. Using the estimated values of the primary parameters, we optimize the secondary parameters κ ik for i = 1, 2, · · · , 14 and k = 1, 2, · · · , 29 related to Eq.(10) and (15) . The choice of the initial values of κ ik is such that for any fixed i, i = 1, 2, · · · , 14, the first fourteen κ ik s i.e., {κ i1 , κ i2 , · · · , κ i14 } are in ascending order, and the rest i.e., {κ i15 , κ i16 , · · · , κ i29 } are in descending order; and ∑ 29 k=1 κ ik = 0.1. After optimization, we obtain the value of the error function Er(P 

. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted July 22, 2021. ; https://doi.org/10.1101/2021.07.16.21260675 doi: medRxiv preprint

To optimize the parameters ξ ik for i = 1, 2, · · · , J and k = 1, 2, · · · , M, we use a MIS with J = 14 and M = 29. However, the method can be applied for any finite integer values of J and M. The schematic diagram of MIS is presented in Fig.5(c) , and consists of the following steps.

• Step 1: We decompose the parametric domain ∆ = {ξ ik |i = 1, 2, · · · , 14; k = 1, 2, · · · , 29} into 14 subdomains ∆ i = {ξ ik |k = 1, 2, · · · , 29} so ∆ = {∆ 1 , ∆ 2 , ∆ 3 , · · · , ∆ 14 }.

•

Step 2: We optimize the subdomain ∆ 1 and consider the other parameters ∆ 2 , ∆ 3 , · · · , ∆ 14 , as constants. After first iteration, we get estimated parameters ∆ op 1 ; the entire parametric domain is ∆ (1) = {∆ op 1 , ∆ 2 , ∆ 3 , · · · , ∆ 14 }, and the error function Er(∆ (1) ).

• Step 3: In the second iteration, we optimize the subdomain ∆ 2 and keeping the other subdomains of ∆ (1) unchanged.

After second iteration, we get estimated parameters ∆ , ∆ 3 , · · · , ∆ 14 }, and the error function Er(∆ (2) ).

• Step 4: Repeated the same procedure discussed in Step 3.

The optimization of the subdomain ∆ 2 , demonstrated in Step 3, is related to minimizing the error function such that Er(∆ (1) ) ≥ Er(∆ (2) ); the equality sign holds for ∆ op 2 = ∆ 2 . The error function of the n + 1th iteration, Er(∆ (n+1) ) cannot be greater than that of nth iteration, Er(∆ (n) ), because of this characteristic of the error function we define the approach as MIS. The flow chat of the optimization scheme is presented in Fig. 5(d) . The upper and lower panels of Fig.5(f) show the estimated values of the secondary parameters λ ik and κ ik , respectively, obtained from the MIS. The upper and lower panels of Fig.5(g) show the values of the error functions E R , using MIS to optimize λ ik , and E D , using MIS to optimize κ ik , for fourteen iteration steps and Er(P 

Clinical features of patients infected with 2019 novel coronavirus in wuhan

Rolling updates on coronavirus disease (COVID-19)

Distribution of incubation periods of COVID-19 in the

Effects of age and sex on recovery from COVID-19: Analysis of 5769 israeli patients

COVID-19 pandemic and its recovery time of patients in India: A pilot study

COVID-19 in a designated infectious diseases hospital outside Hubei Province

Comparisons of viral shedding time of SARS-CoV-2 of different samples in ICU and non-ICU patients

Prolonged presence of SARS-CoV-2 viral RNA in faecal samples

Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study

Predictors of the prolonged recovery period in COVID-19 patients: a cross-sectional study

Inferred duration of infectious period of SARS-CoV-2: rapid scoping review and analysis of available evidence for asymptomatic and symptomatic COVID-19 cases

Lockdown: a non-pharmaceutical policy to prevent the spread of COVID-19. Mathematical modeling and computation

A hybrid multi-scale model of COVID-19 transmission dynamics to assess the potential of non-pharmaceutical interventions

COVID-19: Development of a robust mathematical model and simulation package with consideration for ageing population and time delay for control action and resusceptibility. Phys. D: Nonlinear Phenom

Modeling the dynamics of the COVID-19 population in Australia: A probabilistic analysis

Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The lancet infectious diseases

A mathematical model for simulating the phase-based transmissibility of a novel coronavirus

Estimation of the transmission risk of the 2019-ncov and its implication for public health interventions

Transmission dynamics of COVID-19 in Wuhan, China: effects of lockdown and medical resources

Modeling and forecasting the early evolution of the COVID-19 pandemic in Brazil

Analysis of a vector-borne diseases model with a two-lag delay differential equation. The North Carol

New approximations, and policy implications, from a delayed dynamic model of a fast pandemic

Stability analysis of an age-structured seirs model with time delay

Elementary time-delay dynamics of COVID-19 disease. medRxiv

A time delay dynamic system with external source for the local outbreak of 2019-ncov

Solvable delay model for epidemic spreading: the case of COVID-19 in Italy

Does incubation period of COVID-19 vary with age? a study of epidemiologically linked cases in singapore

centurion XIX Version 19.2.01

Age-dependent effects in the transmission and control of COVID-19 epidemics

Diagnosis and management of first case of COVID-19 in Canada: Lessons applied from SARS

First imported case of 2019 novel coronavirus in Canada, presenting as mild pneumonia

The data used to estimate the model parameters are publicly available and are available with the code.

All code is available in the GitHub repository for the project at https://github.com/SPAUL2021/COVID19RECOVERY/tree/SPAUL2021-patch-1 Author contributions statement S.P. has derived the model, has developed the matlab code, has analyzed the calculated results, and has prepared all figures and tables. S.P. and E.L. have drafted the original article. Both authors have contributed to the editing of the article. Both authors have read and approved the final article.

The authors declare that they have no competing interests.