key: cord-0925508-c0kaqq76
authors: Hassan, Afshan; Prasad, Devendra; Rani, Shalli; Alhassan, Musah
title: Gauging the Impact of Artificial Intelligence and Mathematical Modeling in Response to the COVID-19 Pandemic: A Systematic Review
date: 2022-03-14
journal: Biomed Res Int
DOI: 10.1155/2022/7731618
sha: 293010fae64e5f01941967f4f40754f715d6ea8c
doc_id: 925508
cord_uid: c0kaqq76

While the world continues to grapple with the devastating effects of the SARS-CoV-2 virus, scientific groups around the world are collaborating to find ways to halt the spread of COVID-19 permanently. Accordingly, the current study analyzes predictive models that employ machine learning techniques and mathematical modeling to mitigate the spread of COVID-19. A systematic literature review (SLR) was conducted, in which a search of different databases, viz., PubMed and IEEE Xplore, initially fetched 1178 records. Of these 1178 records, only 50 articles were analyzed completely. Around 64% of the studies employed data-driven mathematical models, whereas only 26% used machine learning models. Hybrid and ARIMA models constituted about 5% and 3% of the selected articles, respectively. Various Quality Evaluation Metrics (QEM), including accuracy, precision, specificity, sensitivity, Brier score, F1-score, RMSE, AUC, and prediction and validation cohort, were used to gauge the effectiveness of the studied models. The study also considered the impact of Pfizer-BioNTech (BNT162b2), AstraZeneca (ChAdOx1), and Moderna (mRNA-1273) on Beta (B.1.1.7) and Delta (B.1.617.2) viral variants and the impact of administering booster doses given the evolution of viral variants of the virus.

Since the 29th of December, 2019, an epidemic of a new coronavirus that began in China has created havoc and dismay all over the world [1]. Coronaviruses belong to a family of positive-sense (+) RNA (ribonucleic acid) viruses that infect the host, inducing symptoms of cold and flu in the mild stage and severe respiratory ailments and multiorgan failure in the lethal stage [2]. This virus can infect humans, and several cases of pets getting infected (Figure 1) have also been reported in different parts of the world. Some countries have a history of underreporting the disease, which acts as a catalyst in the spread of infection. Lack of infrastructure and proper testing techniques, together with high population, may be the reasons for the spread of this deadly virus across countries, continents, and subcontinents [3]. Countries like China, Japan, and Singapore, which reported a higher number of cases in the first stage of the outbreak, have managed to slow down the rate of infection compared to countries like India and the U.S. [4]. The positive COVID-19 cases in India continue to rise; although lockdown, social distancing, and other measures have been implemented, their measurable effect is yet to be seen on a significant scale [5].

This global pandemic has severe implications for people's health and negatively impacts businesses and the economy. On average, the cumulative cases of COVID-19 are increasing day by day, although some countries like Canada, Taiwan, and Iceland have succeeded in flattening the curve [6]. However, with the global race for a vaccine intensifying and theories about plasma technology and herd immunity coming to the surface, the apprehensions about the intensification of the pandemic seem to subside. Still, for some, it raises eyebrows [7]. Moreover, there is little knowledge of what challenges could arise during the development, which could further delay the timelines.

The spread of this newly emerging virus still holds uncertainty regarding current and future behavior, although numerous studies suggesting the worldwide trend have been reported. The role of the airline travel network seems to be pivotal in the spreading of COVID-19, which has led to the development of several mathematical modeling techniques that enable us to examine the present status and demonstrate the future predictions of any eventuality [8]. In addition, numerous researchers have studied the number of confirmed, recovered, and death cases within a specific time frame for various countries to identify the various stages in the plots among different states under study [9]. The possible outcomes in many studies show a positive relationship between global transportation networks and the spread of the disease [10].

The disease can transmit either horizontally or vertically within the population. Horizontal transmission occurs through direct or indirect contact with infected individuals, whereas vertical transmission involves the transmission of disease from mother to unborn offspring [11]. The implementation of lockdown, quarantine, and restrictions on social gatherings has provided some respite and hence enough time for healthcare systems to prepare for the inevitable. Still, it seems pretty harsh to implement unprecedented stringent preventive measures to mitigate or contain the infection in different setups [12].

The lack of effective awareness campaigns among the masses and the absence of effective measures and medical equipment to ensure public health safety in the early stages of the spread of this virus led to its uncontrollable outbreak. Several predictive models proposed so far for understanding the trend of COVID-19 employ variable datasets and deduce many disease-related parameters [13]. These models claim to hold the imprimatur of the science of COVID-19 disease transmission. Given the highly mutating nature of the virus, there is a risk that a more virulent or more transmissible mutation of the COVID-19 strain may crop up, resulting in successive waves of COVID-19. This necessitates the study and deployment of appropriate surveillance and containment measures to contain the consecutive waves of the COVID-19 pandemic [14]. Therefore, an SLR is needed to identify and understand the effective machine learning and mathematical models employed for mitigating the spread of COVID-19 while summarizing and marking the effective solutions from the identified literature. Section 2 summarizes the background and motivation of the study. The methodology employed for performing the systematic literature review (SLR) on the spread of COVID-19 is outlined in Section 3. Section 4 presents the results and discussions on the identified research questions. The limitations of this study are addressed in Section 5.

for vaccine intensifying, China is administering almost more than 2.6 billion and 1.3 billion doses of vaccine. India and the U.S. follow this (see Figure 2). However, there is still a concern about successive waves of COVID-19 hitting different parts of the world, attributed to the evolution of viral variants of the COVID-19 virus. From February 2020 until October 2021, Asia was the center of infections emerging from Wuhan, China, in late 2019. More than 22 million cases were recorded in Brazil, with 617,000 deaths, the highest during the aforementioned period. Recently, many European countries, including Russia, Ukraine, Germany, and Poland, saw a sudden surge in COVID-19 infections, driven by the Delta variant of the COVID-19 virus, in early June 2021. The World Health Organization (WHO) has declared Europe the epicenter of the pandemic, with the U.K. reporting the highest number of COVID-19 infections. The U.S. has reported around 50 million cases with 800,000 deaths. In response to the Omicron variant [15], North American countries have incorporated travel restrictions and updated vaccination status (see Figure 3). Several countries in the Middle East have seen severe outbreaks of the virus. The death toll for Iran is more than 130,000. To keep the daily infections due to the Omicron variant in check, many countries have given booster shots to their population [16]. According to the official figures, South Africa is the worst affected country on the African continent, with more than 3 million COVID-19 cases and around 100,000 deaths.

In order to enhance the collaboration and coordination among the National Institutes of Health (NIH) and the Department of Defense (DoD), a SARS-CoV-2 Interagency Group (SIG) was established by the Health Department of the U.S. [18]. The aim was to monitor emerging variants and their impact on countermeasures, viz., vaccines and therapeutics, and to align with these for preparedness against COVID-19 infection. The World Health Organization (WHO) categorizes the evolving COVID-19 mutants into three groups [19], viz., variants of interest, variants of concern, and variants of high consequence, as in Table 1. Figure 4 depicts the hotspots of the reported COVID-19 cases ranging from 10,000 to 10 million, with the size of the bubble determining the magnitude of the reported COVID-19 cases.

Infection. Due to the similarity of COVID-19 with ordinary flu and pneumonia, testing individuals for COVID-19 is necessary to manage the disease effectively. Testing played a vital role in the first wave of the pandemic and continues to do so in the second wave throughout the world to detect whether an individual has contracted the virus. Testing can help identify the disease in the many asymptomatic and presymptomatic carriers who drive the pandemic silently without developing any symptoms for an extended period. Many studies have marked asymptomatic and presymptomatic individuals as significant carriers of the infection, contributing silently to more than 35% of the COVID-19 infections worldwide. The testing techniques for COVID-19, as depicted in Figure 3, fall into two broad categories (Figure 5), viz., diagnostic tests and serology blood tests/antibody tests [20].

2.2.1. Diagnostic Tests. These tests are responsible for diagnosing whether an individual is COVID-19 positive. Diagnostic tests directly diagnose the presence of virus in nasal or throat swabs; therefore, diagnostic tests are sometimes referred to as direct tests. Diagnostic tests for COVID-19 can be further subdivided into two categories [21] .

(1) Reverse Transcription Polymerase Chain Reaction (RT-PCR). RT-PCR tests, also commonly known as molecular tests, detect the virus in the nasal or throat swab sample collected from a suspected individual. The test works by investigating the presence of COVID-19 RNA (ribonucleic acid) in the collected sample. If found, this RNA is converted into DNA (deoxyribonucleic acid) using reverse transcriptase. The resulting DNA strand is amplified several times so that the presence of COVID-19 infection in suspected individuals can be detected accurately. These tests have specificity and accuracy of more than 75%; however, several studies report false negatives from RT-PCR tests. This might be attributed to the mutations in COVID-19 strains [22].

(2) Rapid Antigen Tests. These tests identify COVID-19 antigens in the throat or nasal swabs collected from a suspected individual. However, because of their low sensitivity, they are more likely than RT-PCR (reverse transcription polymerase chain reaction) tests to miss an active COVID-19 infection. For example, if an individual tests negative in the COVID-19 rapid antigen test, the result needs to be confirmed using the RT-PCR test. The results of rapid tests are usually available within 1 to 2 hours of testing [23].

Tests. These tests, unlike the diagnostic tests, can detect whether an individual was previously infected with COVID-19. Antibody tests detect the presence of antibodies in the blood sample taken from a suspected individual. If the subject under study shows a positive antibody test report, it means that the individual has been affected by the virus sometime in the past. The presence of antibodies in the blood sample results from natural immunity (if the person is not vaccinated) or an immune response generated by the immune system to fight against the infection. These tests might also prove useful to investigate the effect of different types of vaccines developed for COVID-19. However, these tests fail to diagnose an active COVID-19 infection [24].

falls in the category of protein subunit vaccines. These vaccines employ a different strategy of using spike proteins of the virus to produce immunity. Still, the small size of the viral fragment can be an issue, as it can pass the immune system unnoticed. Therefore, these vaccines involve the use of multiple vaccine shots in combination with a chemical adjuvant, thereby enhancing the capability of the vaccine to produce an immune response at a measurable rate.

2.3.4. Nucleic Acid Vaccines. DNA (deoxyribonucleic acid) and mRNA (messenger ribonucleic acid) vaccines fall into the nucleic acid vaccines category. These vaccines operate by inserting genetic code, either through direct attachment to a molecule or by using a gene gun, to produce antigens, thereby removing the need to use the virus directly as a delivery system. Pfizer-BioNTech and Moderna are examples of mRNA vaccines for COVID-19 with reported efficacy of 98%.

The COVID-19 pandemic remains a grave concern even years after its outbreak. The consecutive waves of the pandemic continue to rage on in full force, ravaging different countries with a vengeance. The numbers of confirmed cases, infections, and deaths continue to rise at an alarming rate, and if not controlled, the entire world will come to a halt sooner than expected, as is clear from the rising numbers. The unprecedented and uncontrolled surge in cases in the second wave is attributed to the double mutant (E484Q and L452R) in the B.1.617 COVID-19 strain [27]. The Delta variant, first detected in India in December 2020, remains the most problematic version of the SARS-CoV-2 virus, accounting for nearly all the COVID-19 infections globally, fueled by the unchecked spread of the novel COVID-19 infection in different parts of the world. There is an incredible degree of collaboration on the science side. However, despite such an unprecedented collaboration on the development and deployment of vaccines, the COVID-19 pandemic is still far from over [28].

Several variants of COVID-19 have been reported. The World Health Organization (WHO) classifies Delta as a variant of concern, capable of increasing transmissibility, causing more severe disease, or reducing the effect of treatment and vaccines. Though capable of preventing or mitigating severe illness and death, the current vaccines fail to block the infection completely. The virus can still replicate in the nose, even among vaccinated people, who can then transmit the disease further within a population. Hence, a new generation of vaccines is required to block the transmission of the infection. For the successive waves of COVID-19 hitting different countries, it has been concluded that outbreaks were easier to contain in places with well-functioning testing and tracing systems that quickly catch further episodes before they swell into more infection waves. The countries which succeeded in controlling the reproductive rates and infection rates of COVID-19 in the first wave performed better at mitigating the effects of consecutive waves [29]. Therefore, there is a need to build appropriate predictive modeling strategies to help governments contain the successive waves of the COVID-19 pandemic, ensuring minimal loss of lives whilst keeping the rate of transmission of the disease in check [30].

Globally, the vaccine doses administered (see Figure 2 ) for different countries remain scarce. However, several additional booster doses have been given to fully vaccinated individuals with the emergence of viral variants.

This study presents a systematic literature review (SLR) of the articles [31] published between December 2019 and June 2021. In addition, we applied a series of inclusion and exclusion criteria to produce infographic tables reviewing the state-of-the-art techniques and collating information on COVID-19 prediction modeling. The findings of the SLR will help governments and healthcare practitioners to use the best prediction model, as judged by the highest prediction accuracy and other performance metrics, to contain successive waves of the COVID-19 pandemic in the future and prevent overwhelming the limited medical healthcare resources.

Systematic literature review (SLR) is an organized and systematic process for identifying, evaluating, and critically analyzing relevant research and for collecting and analyzing data from the studies included in the review. The objective of an SLR is to offer a comprehensive insight into current research on the formulated research questions. An SLR activity is governed by the development of a review protocol in the planning phase, which consists of five primary stages, viz., formulation of research questions, design of search strategy, assessment of the literature for quality, procurement of data, and coalescence of data (Figures 6 and 7). The first stage consisted of identifying or formulating well-defined research questions within the scope of the SLR. The keywords and terminologies were identified, and it was ensured that the research questions or the previous studies were not duplicated in the current SLR. In the second stage, we formulated a search strategy focusing on the studies relevant to the research questions developed in the first stage. This involved forming a search string using the keywords identified in the first stage and searching the identified databases relevant to the topic of research. The third stage comprised study selection, describing the inclusion and exclusion criteria for deciding whether a given research article should be included in the current SLR. The identified articles were subjected to a quality assessment procedure, which included the development of quality checklists to aid in the evaluation. The fourth stage consisted of data extraction from the included studies governed by the inclusion and exclusion criteria decided upon in the third stage. A data extraction form refined through a pilot study was developed in this stage. Finally, the fifth stage involved data coalescence according to how the data addressed the research questions identified in the first stage [31].

To elucidate and outline pragmatic evidence on mathematical modeling and machine learning models employed for COVID-19 spread, the current SLR seeks to answer the following set of formulated research questions:

IRQ1: Which machine learning techniques and compartmental models have been used for predicting the future course of COVID-19?

IRQ2: What is the overall accuracy of the prediction models so employed?

IRQ3: What are the critical disease-related parameters and most effective intervention strategies deployed for mitigating the spread of COVID-19 infection?

IRQ4 

String. "2019-nCoV" OR "COVID-19" OR "SARS-CoV-2" OR "HCoV-2019" OR "hcov" OR "NCOVID-19" OR "severe acute respiratory syndrome coronavirus 2" OR "severe acute respiratory syndrome corona virus 2" OR "coronavirus disease 2019" OR (("coronavirus" OR "corona virus") AND (Wuhan OR China OR novel)) AND "Covid-19" AND "Mathematical Modelling" AND "Artificial intelligence" AND (techniques OR models) AND (Vaccines OR "Herd Immun*" OR "Reproductive rate" OR "Asymptomatic" OR "Machine learn*") AND ("SIR Model*" OR Quarantine OR Lockdown).

Engines. Six electronic databases, viz., PubMed, Springer Link, IEEE Xplore, Web of Science, Google Scholar, and Science Direct, were used as the sources of information for collating articles related to the COVID-19 pandemic (Figure 8). The previously created search string was used to narrow the search in the specified databases. The string was adapted to the syntax of each database. All selected databases were searched using titles and keywords, full text, and abstracts; however, Google Scholar was searched using keywords and abstracts only, to minimize duplication of retrieved records.

To minimize selection bias and the duplication of results, an effective, exhaustive, and well-organized search of all the relevant sources is a must for an SLR. Therefore, the initial search process (ISP) has been divided into two phases:

ISP1: This phase involves gathering the candidate set of articles collected by searching identified databases.

ISP2: This phase comprises the identification of relevant references from the candidate set of articles of phase 1 and the addition of the same to the articles of phase 1 if found apropos.

After applying these two initial search phases, Mendeley (http://www.mendeley.com) was employed to organize and manage the search results. The search process was further refined at each stage, subject to several rounds of scrutiny, as shown in Figure 8.

Selection. An enormous number of databases are available for extracting and gathering information related to the chosen domain of study. However, even after selecting specific databases for retrieval of articles, duplication and irrelevancy in the search results cannot be ruled out. Therefore, there is a need to refine the search further to minimize the selection of trivial articles. The study selection phase involves two steps, viz., applying inclusion and exclusion criteria and a quality assessment check for finalizing relevant articles for study. The inclusion and exclusion criteria are applied to the candidate set of articles gathered in the initial search phase 1 (ISP1) to refine the search results further.

Furthermore, the quality assessment criteria are established and practiced for these articles. This results in the selection of articles with a fair chance of answering the formulated research questions, which can then be employed to extract data. The secondary search is divided into the following two phases ( Figure 8 ):

ASP1: this phase scrutinizes the candidate set of articles selected in the search process based on the inclusion and exclusion criteria. The articles possessing the capability to answer the formulated research questions are deemed relevant.

ASP2: this phase applies the quality assessment criteria on the set of relevant papers gathered in ASP1. Also, the set of relevant articles is searched for references relevant to

(e) Exclude articles that do not prescribe any machine learning or mathematical modeling technique

(f)

The initial search retrieved a total of 1178 candidate articles. The investigation was further refined by applying inclusion and exclusion criteria, which deemed 162 articles to be relevant. A secondary search was initiated for these articles to highlight the appropriate references and include relevant articles. The secondary search led to the identification of 7 additional papers pertinent to the study, taking the number of relevant papers to 169. Finally, a quality assessment checklist was applied to these articles, which yielded 50 articles as final for performing the SLR (Figure 8). These articles were then used for the procurement of data.

3.3.2. Quality Evaluation of Selected Articles. Eight quality assessment questions were mapped out to evaluate the plausibility and relevance of the articles selected for study (Table 3). Three possible answers were calibrated for each question: yes, partly, and no. A scoring technique was employed on these quality questions where the answers were scored as "Yes = 1," "Partly = 0.5," and "No = 0." The quality score of each article was then obtained as the sum of the scores of its responses to the quality assessment questions.

The articles with a quality score greater than four were deemed relevant with an acceptable quality grade. This (Table 4 ).
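The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function names and the example answers are hypothetical, and only the answer weights (Yes = 1, Partly = 0.5, No = 0), the eight questions, and the cutoff of four come from the text.

```python
# Illustrative sketch of the quality scoring rule; helper names are hypothetical.
SCORES = {"yes": 1.0, "partly": 0.5, "no": 0.0}
N_QUESTIONS = 8    # eight quality assessment questions (Table 3)
THRESHOLD = 4.0    # articles scoring strictly above this are retained


def quality_score(answers):
    """Sum the per-question scores for one article."""
    if len(answers) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} answers, got {len(answers)}")
    return sum(SCORES[a.lower()] for a in answers)


def is_acceptable(answers):
    """An article is kept only if its score is greater than the threshold."""
    return quality_score(answers) > THRESHOLD


# Made-up answers for a single article, purely for demonstration.
example = ["yes", "yes", "partly", "no", "yes", "partly", "yes", "no"]
print(quality_score(example), is_acceptable(example))
```

Note that an article answering "partly" to every question scores exactly 4 and would therefore not pass under a strict "greater than four" reading.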

3.4. Threats to Validity. The gauging of threats to the review protocol's validity is critical to ensure that the final set of selected articles considered for review are of acceptable quality. Three primary challenges to the credibility of this review methodology have been reviewed, viz., article selection bias, publication bias, and probable information gathering inaccuracy.

The selection of publications for review involves the identification of key terms apropos to answer the formulated research questions. The next step consists of using subsequent strings or words for searching in the database engines identified for this SLR. However, it might so happen that titles, abstracts, or keywords of some relevant articles might not contain keywords in alignment with the aforementioned key terms.

In order to avert this bias in article selection, a manual search of COVID-19 articles in Dimensions was conducted to minimize the chances of missing papers relevant to this review. Also, the lookup of significant references in the selected papers and the application of inclusion and exclusion criteria in strict compliance with the identified research questions helped curb this threat to a reasonable extent. Finally, two reviewers were assigned for study selection, and the disagreements among them were resolved through consensus to stave off the study selection bias further. However, it is plausible that some of the relevant studies might have been overlooked. We presume the number of such cases to be relatively small.

Publication bias in the form of outcome reporting bias, gray-literature bias, and language bias is bound to coexist in our research. For example, outcome reporting bias dictates the publishing of positive results concerning probabilistic models in more numbers than negative results, leading to overestimating performance results. To alleviate the outcome reporting bias, some of the chosen articles report both positive and negative comparisons concerning applying the different probabilistic models employed for publication. In addition, the exclusion of gray literature (government reports, thesis reports, etc.) paves the way for the existence of publication bias ineluctably.

Finally, to suppress the risk of inaccurate extraction of information, a reevaluation scheme was enforced on the selected articles to identify true positives. This situation arises when the title of the chosen study dictates significance, but the contents are deemed insignificant to answer the formulated research questions. A quality assessment criterion was established through the formulation of quality assessment questions. All the authors rated the quality questions independently and reached consensus, resolving conflicts and achieving similarity in the context of rating.

The application of the quality assessment criteria on the selected set of articles furnished 50 articles of considerable quality. These articles were subjected to a data extraction procedure to fetch the following significant information, viz.,

In order to extract and gather information from the selected articles to answer the different research questions, two types of data synthesis techniques were employed, viz., narrative synthesis, reciprocal translation, and indirect translation. For addressing IRQ1 to IRQ3, narrative synthesis was employed to display and disseminate the data on the identified research questions. In addition, different types of

4.1. Characteristics of Selected Articles. Modeling approaches, key epidemiological parameters, and intervention strategies for COVID-19 are as follows: different modeling approaches were proposed, evaluated, and analyzed for articles under study ( Figure 9 ). These models were classified into four categories: data-based mathematical models, machine learning models, ARIMA and regression, and hybrid models.

Compartmental models assign a group of populations to different labeled compartments for modeling an infectious outbreak [39] . These models employ a mix of complicated integrodifferential mathematical equations, thereby aiding in the realization and plotting of various disease-related parameters, viz., infection rates, recovery rates, incubation period, latent period, and reproductive rate [40] . In addition, the impact of different intervention strategies, viz., quarantine, lockdown, and travel restrictions, can also be studied using these models by incorporating the appropriate compartment in the basic SIR model.
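To make the compartmental idea concrete, the following is a minimal sketch of the basic SIR model integrated with a simple forward-Euler scheme. It is not taken from any of the reviewed studies: the function name, the parameter names `beta` (transmission rate) and `gamma` (recovery rate), the step size, and all numerical values are assumptions chosen purely for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.1):
    """Forward-Euler integration of the basic SIR equations on population fractions:

        ds/dt = -beta * s * i
        di/dt =  beta * s * i - gamma * i
        dr/dt =  gamma * i

    Returns one (s, i, r) tuple per day, starting with day 0.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # flow from S to I
            new_recoveries = gamma * i * dt      # flow from I to R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Illustrative (assumed) parameter values, not fitted to any dataset.
traj = simulate_sir(beta=0.45, gamma=0.2, s0=0.999, i0=0.001, days=120)
peak_day = max(range(len(traj)), key=lambda d: traj[d][1])
print(f"peak infected fraction {traj[peak_day][1]:.3f} on day {peak_day}")
```

Interventions such as quarantine or lockdown would typically be represented either by adding a further compartment with its own flows or by lowering `beta` over time; a production analysis would also use an adaptive ODE solver rather than fixed-step Euler.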

Learning Models. Different machine learning algorithms, viz., support vector machines (SVM), random forests (RF), gradient boosting trees (GBT), logistic regression, and neural networks, might be employed to predict the chance of COVID-19 infection in a population (Figure 10). To effectively track COVID-19 patients in hospitals at early stages, as shown in Figure 5, X-ray images of patients are scanned with the help of efficient machine learning algorithms, and this has assisted in clinical decision making at the early stages of the pandemic throughout the world [41, 42]. The use of machine learning algorithms not only limits the burden on limited healthcare resources but also helps deliver better treatment outcomes [43, 44].

boon for healthcare practitioners to develop optimal policies to control or mitigate the spread of COVID-19 in various settings [59]. The mathematical formulae devised using statistical modeling can help predict the future course of infections, which can aid in optimal policymaking. Different machine learning algorithms, viz., support vector machines (SVM), random forests (RF), gradient boosting trees (GBT), and logistic regression, can work efficiently in tandem with explicit differential equations devised through modeling, which might help in future forecasting of the pandemic, as shown in Figures 2 and 6, based on historical patterns of data in different settings [60, 61]. Incorporating various disease-related parameters and variables into statistical models might provide insights into the dynamics of disease transmission, and this might prove helpful in future forecasting of the disease. The models so developed through the integration of mathematics and AI technologies will help investigate the effects of various interventions like quarantine, testing, drugs, and vaccination, and their relative impacts on flattening the curve.

Furthermore, the models mentioned above identified for prediction covered four aspects of study, viz., gaining insights into the transmission dynamics with or without predicting the course of COVID-19 infection in advance (infection rates, recovery, and death rates), metrics employed by prediction models for assessing performance, assessment of various disease-related parameters of COVID-19, the efficacy of reported vaccines on mutating COVID-19 strains, and gauging the impact of various intervention measures on the spread of COVID-19.
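As a rough illustration of the performance metrics referred to above, the sketch below computes accuracy, precision, sensitivity, specificity, and F1-score from confusion-matrix counts, together with the Brier score and RMSE. The function names and the example counts are hypothetical and are not drawn from any of the reviewed studies; the formulas are the standard textbook definitions.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero; a robust version would guard against that.
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def rmse(predicted, observed):
    """Root mean squared error between forecast and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


# Made-up counts and values, for demonstration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
print(brier_score([0.9, 0.2], [1, 0]), rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Classification metrics of this kind suit the machine learning models, whereas RMSE is the more natural choice for comparing forecast case counts against observed counts.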

From the selected studies, 64% include mathematical models for modeling infections. Over 2.48% of the articles employed a single machine learning algorithm for study, whereas 26% employed multiple machine learning algorithms, viz., support vector machine (11%), random forest (6%), decision trees (4%), gradient boosting algorithm (3%), AdaBoost (1%), and XGBoost (1%) (Figure 10). Nearly 40% of the articles studied and calibrated the basic SIR model, with around 35% of the models predicting the trend of this infectious spread. Around 30% of the selected studies modeled the effect of various intervention measures, viz., lockdown (8%), quarantine (16%), travel restrictions (3%), and asymptomatic cases (3%), on the infection rates (Table 5).

Hybrid models employing compartmental and machine learning algorithms constituted only 5% of the selected studies. Around 43% of the chosen studies predicted the trend of COVID-19 spread, whereas 38% focused on various epidemiological parameters, viz., reproductive rate (26%), case fatality ratio (6%), and herd immunity (4%), and their effect on curbing the infection rates or flattening the COVID-19 infection curve. Also, ARIMA and regression accounted for nearly 3% of the articles under study. Figure 11 portrays the forecasting dynamics employed by the different articles under study.

Employed? The articles selected for study employed compartmental models, machine learning models, or a combination of both to project the infection rates. For studies using mathematical modeling, viz., SIR, SEIR, and SIRD models, the prediction accuracy is difficult to anticipate in advance. These models rest on several assumptions, which may or may not hold in different settings. The assumptions are idealizations and approximations of what is happening in reality. Therefore, it is unrealistic to expect reliable predictions from models that cannot mirror the different facets of reality. For example, Lourenço et al. [48] employed a SIR (Susceptible, Infected, Recovered) model to study the severity of the spread of COVID-19 in the U.K. and Italy. The study predicted that 60% of the U.K. population would be infected with the virus by 19 March 2020 at R0 = 2.25. The SIR model dictates that the initial number of susceptibles, S(0), should be less than the ratio a/b to prevent an epidemic, where a is the recovery rate and b is the transmission rate. This implies that even before the arrival of mutant strains, a certain fraction of the population should be vaccinated to reduce the initial number of individuals susceptible to infection, thereby maintaining S(0) < a/b. However, this underestimates herd immunity, which can only be achieved if the pandemic spreads in more than 95% of the population. Also, Gupta [43] employed an SEIR model to predict the future trend of COVID-19 in India for three weeks. However, the predictions cannot be expected to be 100% accurate because of deviations, viz., underreporting of COVID-19 data and the assumptions made while formulating the model.
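The threshold condition S(0) < a/b can be illustrated numerically. The sketch below is a minimal forward-Euler integration of the basic SIR equations in mass-action form (dS/dt = -bSI, dI/dt = bSI - aI, dR/dt = aI), using the notation of the text (a is the recovery rate, b the transmission rate). The parameter values, the step size, and the function name simulate_sir are illustrative assumptions and are not taken from any of the reviewed studies.

```python
def simulate_sir(s0, i0, a, b, days=200, dt=0.1):
    """Integrate the basic SIR model with forward Euler steps.

    s0, i0: initial susceptible and infected counts (assumed values)
    a: recovery rate, b: transmission rate (notation from the text)
    Returns the largest number of infected individuals seen during the run.
    """
    s, i, r = float(s0), float(i0), 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = b * s * i * dt   # flow from S to I
        new_recoveries = a * i * dt       # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak


# Hypothetical rates chosen so that the threshold a/b is about 500 individuals.
a, b = 0.1, 0.0002

peak_above = simulate_sir(s0=1000, i0=1, a=a, b=b)  # S(0) > a/b: outbreak grows
peak_below = simulate_sir(s0=400, i0=1, a=a, b=b)   # S(0) < a/b: infection dies out

print(f"peak infected with S(0)=1000: {peak_above:.1f}")
print(f"peak infected with S(0)=400:  {peak_below:.1f}")
```

When S(0) is below a/b, the infected count never rises above its initial value, which is the sense in which reducing the susceptible pool (for example, through vaccination) prevents an epidemic in this idealized model.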

Similarly, Kyrychko et al. [47] suggested a variation of the SEIR (Susceptible, Exposed, Infected, Recovered) model to model the effect of COVID-19 infection on recovery and death rates in Ukraine. The study suggested that, without appropriate mitigation strategies, both infection and death rates would increase with time in individuals aged 60-70. However, it was later found that young people aged 25-35 were the most affected. Therefore, it is difficult to anticipate the reliability of such predictions. Furthermore, the mathematical models of [51] assume a homogeneously mixing population, which is unrealistic, as it is implausible that all individuals have an equal probability of coming into contact with other individuals. Also, the model of [55], which has been validated for a large population, might fit well for cities; applying the same equations to villages, however, will lead to unrealistic results.

Moreover, the results of [59] assume an exponential distribution of infection and overlook the period from the onset of symptoms to recovery or death, which is quite unrealistic. For studies employing more than one model for prediction, the fitted parameters and deduced results might be considered valid and robust. For studies relying on only one model, however, the notion of prediction should be restricted when referring to the parameters dictated by the model. Table 6 depicts the Quality Evaluation Metrics employed by the articles selected for the SLR of COVID-19.

Read et al. [35]: Early estimation of various parameters concerning epidemiology; predicts the value of the reproductive ratio, R0, to be 3.1 for China.

Hong et al. [36]: Evaluating the value of R0 to enhance the effectiveness of various policies for early control of the pandemic for China.

Zhong et al. [37]: Modeling the infection and removal rates of COVID-19 and prediction of the cumulative COVID-19 cases for China.

Hassanein et al. [38]: Detection of the presence of COVID-19 infection in the lungs at early stages. Model: SVM.

De Moraes et al. [39]: Employing machine learning algorithms to prioritize the infected cases for receiving the RT-PCR (reverse transcription polymerase chain reaction) tests in case of limited testing resources. Models: SVM, RF, GBT, and LR.

Zoabi et al. [40]: Prioritizing the infected cases for receiving the RT-PCR tests in case of limited testing resources.

Farooq et al. [41]: Understanding the trend of infectious spread for the worst-hit states of India. Model: SIR.

Dos Santos Santana et al. [42]: Prioritization of the infected cases for receiving the RT-PCR tests in case of limited testing resources. Models: SVM, RF, GBT, LR, and DT.

Gupta et al. [43]: Computation of the reproductive ratio, R0, to predict the future trend of COVID-19 for three weeks.

Anderez et al. [44]: Modeling the mortality rate for people who are already vulnerable to infection due to their advanced age or existing comorbidities before getting exposed to the virus.

Goodman-Meza et al. [45]: Various features and clinical diagnoses, such as complete blood counts (CBCs) and various inflammatory markers, are studied to diagnose a person as being COVID-19 positive or negative. Models (machine learning and ensemble): SVM, LR, RF, AdaBoost, and XGBoost.

Ndaïrou et al. [46]: Graphing the number of confirmed, recovered, and fatality cases using a dataset within a stipulated time period.

Kyrychko et al. [47]: Exploration of the impact of lockdown strategies on infection and death rates. Model: SEIR.

Lourenço et al. [48]: Study of the severity of the spread of COVID-19 via modeling of infections. Model: SIR.

The study explores the impact of lockdown strategies on infection, recovery, and death rates.

Anastassopoulou et al. [52]: Modeling the effect of various parameters, viz., CFR (case fatality ratio) and R0 (reproductive ratio), related to COVID-19 epidemiology on the infection, death, and recovery rates.

Nonetheless, the benefit that these mathematical models offer in terms of early prediction of infection, death, and recovery rates and the development of pandemic control policies cannot be overlooked. Several mathematical models employed multiple-factor optimization to account for the bias in calculations caused by underreporting of data. For example, Anastassopoulou et al. [52] employed the SIRD (Susceptible, Infectious, Recovered, Dead) model to study the effect of various parameters, viz., CFR and R0, related to COVID-19 epidemiology on the infection, death, and recovery rates for Hubei, China. The projected average value of R0 is 2.6, based on SIRD simulations. The simulations were repeated with the number of infected cases multiplied by a factor of 20 and the number of recovered cases multiplied by 40 to account for the bias caused by underreporting of asymptomatic or presymptomatic patients. Around 38% of the compartmental models employed for studying the dynamics of COVID-19 include stability and sensitivity analysis of various parameters, which apportions the uncertainty in the output to the different sources of uncertainty in the inputs, to justify the reliability of results. These models are quite helpful for incorporating the effect of various intervention strategies, viz., lockdown, quarantine, and the role of international travel, on the epidemic curve. To gauge the accuracy of mathematical models, some parameters are assumed, while others are deduced by fitting the model to datasets.
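The effect of such an underreporting correction on a derived quantity can be sketched with simple arithmetic. The example below applies the scaling factors quoted above (20 for infected cases, 40 for recovered cases) to a set of reported counts and recomputes a crude case fatality ratio. The counts are hypothetical, the helper crude_cfr is introduced only for illustration, and leaving deaths unscaled is an assumption of this sketch rather than a detail confirmed by the reviewed study.

```python
def crude_cfr(deaths, infected):
    """Crude case fatality ratio: deaths divided by cumulative infections."""
    return deaths / infected


# Hypothetical reported cumulative counts (not taken from any reviewed study).
reported = {"infected": 50_000, "recovered": 5_000, "deaths": 1_200}

# Scaling factors quoted in the text for the repeated simulations.
INFECTED_FACTOR = 20
RECOVERED_FACTOR = 40

adjusted = {
    "infected": reported["infected"] * INFECTED_FACTOR,
    "recovered": reported["recovered"] * RECOVERED_FACTOR,
    "deaths": reported["deaths"],  # assumption: deaths are not rescaled
}

cfr_reported = crude_cfr(reported["deaths"], reported["infected"])
cfr_adjusted = crude_cfr(adjusted["deaths"], adjusted["infected"])

print(f"crude CFR from reported counts: {cfr_reported:.4f}")
print(f"crude CFR after scaling:        {cfr_adjusted:.4f}")
```

Because only the denominator grows, the crude ratio falls by the same factor of 20, which shows how strongly severity estimates depend on the assumed level of underreporting.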

Shin [69]: Modeling through multistage transitions to understand the dynamics of three successive waves of COVID-19 transmission.

Li et al. [70]: Modeling various disease-related parameters, viz., reproductive rate, incubation period, transmission rate (TR), and time to hospitalization (TSOH).

De La Sen et al. [71]: Study of the effect of partial and total quarantine of both infectious and susceptible populations, without the inclusion of demography and mortality, on the transmission rate of COVID-19.

Abbasi et al. [72]: Study of the effect of quarantine on the infection and recovery rates of COVID-19. Model: SQEIAR.

Khanday et al. [73]: Prediction of the number of infected, recovered, and death cases of COVID-19.

With a view to gauging the capability of machine learning prediction models for the spread of COVID-19, modeling has to be guided by performance metrics. These evaluation metrics quantify the performance of machine learning models. Different algorithms are compared, and hyperparameters are tuned using an agreed-upon set of features. Accuracy, precision, ROC/AUC, sensitivity, specificity, F1 score, recall, and Brier score are some of the performance metrics for evaluating the developed predictive model. More than 80% of the ML models employed for COVID-19 spread use specificity, sensitivity, accuracy, precision, and recall to evaluate performance. Hassanein et al. [38] suggested the use of SVM (support vector machine) to diagnose whether an individual is infected with COVID-19 or not. The reported accuracy, specificity, and sensitivity were 97.5, 99.7, and 95.8, respectively. Likewise, De Moraes et al. [39] studied SVM, RF, GBT, and logistic regression for COVID-19 spread. Of these algorithms, SVM and random forest reported similar AUC, sensitivity, and specificity of 0.851, 0.677, and 0.850, respectively.
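To make the relationship between these metrics concrete, the sketch below derives accuracy, precision, sensitivity (recall), specificity, and F1 score from the four cells of a binary confusion matrix, and computes the Brier score from predicted probabilities. The counts and probabilities are hypothetical, and the function names are introduced here only for illustration; the reviewed studies may have computed their metrics with other tools.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common evaluation metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical test-set results for a COVID-19 positive/negative classifier.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
brier = brier_score([0.9, 0.2, 0.7, 0.4], [1, 0, 1, 0])

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"brier: {brier:.3f}")
```

Sensitivity and recall are the same quantity, so a study that reports both is reporting one number under two names; the Brier score differs from the others in that it evaluates predicted probabilities rather than hard class labels.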

Mitigating the Spread of COVID-19 Infection? The probabilistic models employed for understanding the dynamics of COVID-19 spread have been used to deduce several disease-related parameters, viz., case fatality ratio (CFR), reproductive rate (R0), transmission rates, infection rates, recovery rates, asymptomatic infection rate, and herd immunity (Figure 12). In addition, several control measures and their effect on the infection rates have been studied (Figure 13). Around 45% of the developed models incorporated the estimation, assumption, and effect of varying R0 of COVID-19 to gain insights into the transmission dynamics of COVID-19, whereas only 3% of the selected articles focused on the deduction of CFR, an important parameter for understanding the severity of the disease. McGoogan and Wu [32] estimated the CFR in China to be 2.4% on 12 February 2020. Also, Wu et al. [34] employed an SEIR model for understanding the trend of infectious spread of COVID-19 for major cities of China and deduced R0 to be roughly 2.7 using Monte Carlo simulations. Read et al. [35] studied the early estimation of various parameters and predicted R0 = 3.1, assuming a Poisson distribution for the infectious cases in Wuhan, China. Table 7 lists multiple disease-related parameters and intervention measures considered by the articles under study. Estimating infection and death rates for a predefined interval is an important exercise for preparing in advance to mitigate COVID-19 infection. Around 39% of the selected studies model the infection and death rates of COVID-19.

Anastassopoulou et al. [52] considered a SIRD model for simulating the total number of infections and predicted that the number of infections would cross 200,000 (2 lakh) by February and that the death toll would cross 2,800 by the end of February 2020. Around 3% of the studies modeled the effect of asymptomatic individuals on the growth curve of the epidemic. The time-varying SIR model of Peng et al. [68] confirmed a 20% contribution of asymptomatic infections to the total infections. Also, Tomochi and Kono [49] included an additional infected compartment, I, in the basic SIR model and reported that asymptomatic infections account for 15% of the COVID-19 infections.

Around 45% of the models considered for the projection of infection rates studied the effect of quarantine on the infected cases. Zhong et al. [37] predicted a reduction in peak infectives of 40-50% through the deployment of a quarantine regime at 20% in China. Lockdown and social distancing regimes are modeled by around 25% and 13% of the selected articles, respectively. Khan et al. [50] concluded that a reduction 

The study concluded a 60% reduction in infections through the implementation of social distancing alone. Mandal et al. [54] employed a variation of the SIR (Susceptible, Infectious, Recovered) model, called the SEQIR model, to predict the effect of quarantine on the reproductive ratio,

(Figure 15). This suggests that although the vaccines show decreased efficacy against the B.1.1.7 and B.1.617.2 variants, and reinfection with both variants remains possible, protection of 80-90% is still expected from vaccine-induced antibodies.

Pouwels [80] Figure 16 ) is more capable of neutralizing vaccine-induced antibodies than the B.1.1.7 variant.

Continue the Research Path? The COVID-19 virus is mutating at a fast pace. The Delta variant is classified by the World Health Organization (WHO) as a variant of concern, capable of increasing transmissibility, producing more severe illness, or limiting the effectiveness of therapy and vaccinations [77]. While capable of preventing or lessening the severity of the disease and mortality, the present vaccines do not block the infection completely.

The new COVID-19 variant Omicron (B.1.1.529) has been identified in South Africa, and it has become a source of concern for many countries in terms of its virulence and transmissibility. This variant carries 50 mutations (32 on the spike protein), which could further drive consecutive waves of COVID-19, whereas Delta had two mutations. The Omicron variant is spreading faster, with a doubling time between 1.5 and 3 days (Figure 17) in countries with documented community transmission. Omicron has a substantial growth advantage over Delta [15]. The number of infections has increased tenfold since October 2021 (Figure 18). There is a sudden surge in cases and probably an increase in hospitalizations and deaths, thereby draining hospital capacity (Figure 19). Booster campaigns and new social restrictions might help in keeping the infections at bay. Quarantine is again imposed on individuals traveling from countries where this mutant strain has been found, and WHO suggests the widespread use of boosters for protection against this variant. As such, COVID-19 is far from over.
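The practical meaning of a doubling time between 1.5 and 3 days can be worked out with a short calculation. The sketch below assumes purely exponential growth at a constant rate, which is an idealization that holds, at best, only in the early phase of a wave; the function names are introduced here for illustration.

```python
import math


def daily_growth_factor(doubling_time_days):
    """Factor by which cases multiply each day under exponential growth."""
    return 2 ** (1 / doubling_time_days)


def days_to_multiply(factor, doubling_time_days):
    """Days needed for cases to grow by `factor` at a fixed doubling time."""
    return doubling_time_days * math.log2(factor)


# Doubling times quoted in the text for the Omicron variant.
for doubling_time in (1.5, 3.0):
    growth = daily_growth_factor(doubling_time)
    tenfold = days_to_multiply(10, doubling_time)
    print(
        f"doubling time {doubling_time} d: "
        f"x{growth:.2f} per day, tenfold in {tenfold:.1f} d"
    )
```

Under this assumption, a doubling time of 1.5 days corresponds to roughly a 59% increase in cases per day, and a doubling time of 3 days to roughly a 26% increase per day.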

Shared data is playing a pivotal role in the global efforts to combat the spread of COVID-19. Therefore, collaboration on public and private platforms is needed to review the global data for testing potential treatments, vaccines, and therapeutics [79]. The more research is conducted on the different facets of COVID-19 spread, the more expertise is gained in developing optimal solutions to block the progression of COVID-19. This justifies the need for continuous research focused on mitigating the effects of virulent replicating strains of COVID-19 [81].

The different ML models employed by the articles considered for this SLR have included different performance metrics to evaluate the accuracy of the prediction models. Besides this, several other factors contributing to the accuracy, viz., generality, decipherability, and accountability, have not been considered by this review. Due to the difference in experimental designs, the accuracy of reported results is difficult to anticipate, subject to the conditions taken into account while generalizing models. No comparison between ML and mathematical models has been contemplated. This inconsistency might be attributed to the limited number of articles considered for this study. Also, the compartmental models reviewed employ many assumptions while modeling the COVID-19 pandemic. However, these assumptions change with the emergence and availability of new data. With this, projections are subject to change; hence, one model might be plausible under certain conditions but might be deemed unfit in other scenarios. No metric has been evaluated or reported to gauge this inconsistency. The reported mathematical predictions, viz., infection numbers, vary significantly with the changing nature of the mutating COVID-19 viral strain, which calls into question the understandability, applicability, and reliability of these models in different scenarios. Some notable limitations of this SLR are as follows:

(a) GIGO (Garbage In, Garbage Out) was overestimated while performing this SLR
(b) Heterogeneity concerning statistical assessment was outperformed in the current SLR
(c) Meta-analysis methods were underestimated in the studies selected for conducting this SLR
(d) Nonstandardization of assessment methods in the studied articles could not be wholly avoided
(e) Generalization of results from an SLR to contexts not studied would report issues
(f) Publication and language bias could not be completely eliminated
(g) The reported results were sensitive to the size of the studies selected for this SLR

Predictive modeling is a must to contain the devastating Delta strain of the virus at this lethal stage of COVID-19. The reported results from the SLR encompass and summarize diverse models and techniques used for analyzing the dynamics of the spread of COVID-19. Around 35% of the selected studies reported dynamics of COVID-19 case numbers, 30% modeled the effect of intervention measures, and 20% estimated different disease-related parameters concerning COVID-19. Only 10% and 5% of the studies focused on testing strategies and vaccine effectiveness, respectively. The current SLR shows a positive effect of BNT162b2, ChAd0x1, and mRNA-1273 on the B.1.1.7 and B.1.617.2 viral variants and suggests administering additional booster doses to immunosuppressed or normal individuals to make up for waning antibodies, given the continuously evolving variants of COVID-19. Most of the models used a 95% CI for predicting cumulative cases over a predefined interval. The findings of the SLR suggest that predictions made by different models are essential to understand the course of the COVID-19 pandemic, subject to the QEM used by each. The results from the performance metrics used by each show that random forest (RF) and support vector machine (SVM) performed better for predicting COVID-19 case numbers, followed by decision tree (DT), linear regression, and gradient boosting algorithm (GBA). Moreover, this systematic review suggests using the SIR model to incorporate various disease parameters. This would help in gauging the impact of different interventions for controlling the pandemic and in modeling vaccination, which seems to be the most important for this global emergency. However, for a given scenario, it is difficult to anticipate which model will perform best because of the continuous change in the dynamics of the COVID-19 virus and the dataset chosen for study. 
The machine learning algorithms might be integrated with deep learning algorithms to project COVID-19 infection cases in advance, and mathematical modeling might be used to study the effect of control measures on the infection rates.

Publicly available datasets were analyzed in this study. These datasets can be found at https://github.com/CSSEGISandData/COVID-19. Detailed links are given in references.

Forecasting incidences of COVID-19 using Box-Jenkins method for the period July 12-Septembert 11, 2020: A study on highly affected countries

Investigating a nonlinear dynamical model of COVID-19 disease under fuzzy caputo, random and ABC fractional-order derivative

Impact assessment of containment measure against COVID-19 spread in Morocco

Linear regression analysis to predict the number of deaths in India due to SARS-CoV-2 at 6 weeks from day 0 (100 cases

A time-varying SIRD model for the COVID-19 contagion in Italy

An approximation-based approach for periodic estimation of effective reproduction number: a tool for decision-making in the context of coronavirus disease 2019 (COVID-19) outbreak

Estimation of the time-varying reproduction number of COVID-19 outbreak in China

Medical image-based detection of COVID-19 using deep convolution neural networks

Projecting the transmission dynamics of SARS-CoV-2 through the post-pandemic period

Estimating uncertainty and interpretability in deep learning for coronavirus (COVID-19) detection

Stability issues of RT-PCR testing of SARS-CoV-2 for hospitalized patients clinically diagnosed with COVID-19

COV-IDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on chest X-ray images

Automated detection of COVID-19 cases on radiographs using shape-dependent Fibonacci-p patterns

Analysis and predictions of spread, recovery, and death caused by COVID-19 in India

Omicron SARS-CoV-2 variant: a new chapter in the COVID-19 pandemic

An explainable system for diagnosis and prognosis of COVID-19

JHU CSSE COVID-19 dataset. Daily reports

Pandemic printing: Evaluation of a novel 3D printed swab for detection of SARS-CoV-2

Infectivity upsurge by COVID-19 viral variants in Japan: evidence from deep learning modeling

Implementation of an efficient SARS-CoV-2 specimen pooling strategy for high throughput diagnostic testing

Performance and cost-effectiveness of a pooled testing strategy for SARS-CoV-2 using real-time polymerase chain reaction in Uganda

Clinical evaluation of a multiplex real-time RT-PCR assay for detection of SARS-CoV-2 in individual and pooled upper respiratory tract samples

Enhanced throughput of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) real-time RT-PCR panel by assay multiplexing and specimen pooling

Development and performance evaluation of the first in-house multiplex RT-PCR assay in Bangladesh for highly sensitive detection of SARS-CoV-2

Association between mRNA vaccination and COVID-19 hospitalization and disease severity

mRNA vaccine-elicited antibodies to SARS-CoV-2 and circulating variants

Recognition of variants of concern by antibodies and T cells induced by a SARS-CoV-2 inactivated vaccine

Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine

Neutralization of VUI B.1.1.28 P2 variant with sera of COVID-19 recovered cases and recipients of Covaxin an inactivated COVID-19 vaccine

Differential effects of intervention timing on COVID-19 spread in the United States

Systematic literature reviews in software engineering -A systematic literature review

Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China

The impact of influenza vaccination on the COVID-19 pandemic? Evidence and lessons for public health policies

Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study

Novel coronavirus 2019-nCoV: Early estimation of epidemiological parameters and epidemic predictions

Estimation of time-varying reproduction numbers underlying epidemiological processes: a new statistical tool for the COVID-19 pandemic

Early prediction of the 2019 novel coronavirus outbreak in the mainland China based on simple mathematical model

Automatic X-ray COVID-19 lung image classification system based on multi-level thresholding and support vector machine

COVID-19 diagnosis prediction in emergency care patients: a machine learning approach

Machine learning-based prediction of COVID-19 diagnosis based on symptoms

A deep learning algorithm for modeling and forecasting of COVID-19 in five worst affected states of India

Classification models for COVID-19 test prioritization in Brazil: Machine learning approach

Machine Learning Models for Government to Predict COVID-19 Outbreak, Digital Government: Research and Practice

A COVID-19-based modified epidemiological model and technological approaches to help vulnerable individuals emerge from the lockdown in the UK

A machine learning algorithm to increase COVID-19 inpatient diagnostic capacity

Mathematical Modeling of the Dynamics and Containment of COVID-19 in Ukraine

Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic

A mathematical model for COVID-19 pandemic-SIIR model: effects of asymptomatic individuals

A Predictive Model for COVID-19 Spread -With Application to Eight US States and how to End the Pandemic

Mathematical model to analyze the effect of quarantine on spread and containment of COVID-19

Data-Based Analysis, Modeling and Forecasting of the COVID-19 Outbreak

propagation analysis of COVID-19: an SIR model-based investigation of the pandemic

A model based study on the dynamics of COVID-19: prediction and control

COVID-19 outbreak in India: an SEIR model-based analysis

A SIR-Poisson model for COVID-19: evolution and transmission inference in the Maghreb central regions

Simulating the Progression of the COVID-19 Disease in Cameroon Using SIR Models

A SIR-type model describing the successive waves of COVID-19

Extensions of the SEIR model for the analysis of tailored social distancing and tracing approaches to cope with COVID-19

Prediction of the COVID-19 Epidemic Trends Based on SEIR and AI Models

Adaptive susceptible-infectious-removed model for continuous estimation of the COVID-19 infection rate and reproduction number in the United States: modeling study

Forecasting COVID-19 epidemic in India and high incidence states using SIR and logistic growth models

Analysis and Prediction of COVID-19 Using SIR, SEIQR, and Machine Learning Models: Australia, Italy, and UK Cases

Forecasting COVID-19 Confirmed Cases, Deaths, and Recoveries: Revisiting Established Time Series Modeling through Novel Applications for the USA and Italy

Caputo SIR model for COVID-19 under optimized fractional order

The introduction of population migration to SEIAR for COVID-19 epidemic modeling with an efficient intervention strategy

Modeling and forecasting of COVID-19 using a hybrid dynamic model based on SEIRD with ARIMA corrections

Estimating unreported COVID-19 cases with a time-varying SIR regression model

A multi-stage SEIR(D) model of the COVID-19 epidemic in Korea

An evaluation of COVID-19 transmission control in Wenzhou using a modified SEIR model

On confinement and quarantine concerns on an SEIAR epidemic model with simulated parameterizations for the COVID-19 pandemic

Optimal control design of impulsive SQEIAR epidemic models with application to COVID-19

Machine learning based approaches for detecting COVID-19 using clinical text data

Machine learning models for covid-19 future forecasting

Use of a modified SIRD model to analyze COVID-19 data

Forecasting of the COVID-19 pandemic situation of Korea

Effectiveness of the Pfizer-BioNTech and Oxford-AstraZeneca Vaccines on Covid-19 Related Symptoms, Hospital Admissions, and Mortality in Older Adults in England: Test Negative Case-Control Study

BNT162b2 and mRNA-1273 COVID-19 Vaccine Effectiveness against the SARS-CoV-2 Delta Variant in Qatar

Effectiveness of Covid-19 vaccines against the B.1.617.2 (Delta) variant

Effect of Delta variant on viral burden and vaccine effectiveness against new SARS-CoV-2 infections in the UK

SARS-CoV-2 B.1.617.2 Delta variant emergence and vaccine breakthrough

The authors declare that they have no conflict of interest to report.