key: cord-0881456-m8da7f9m authors: Davidson, Lena; Boland, Mary Regina title: Towards deep phenotyping pregnancy: a systematic review on artificial intelligence and machine learning methods to improve pregnancy outcomes date: 2021-01-06 journal: Brief Bioinform DOI: 10.1093/bib/bbaa369 sha: 99e6c82963405e145e0d491bb00bdc22f8a31387 doc_id: 881456 cord_uid: m8da7f9m OBJECTIVE: Development of novel informatics methods focused on improving pregnancy outcomes remains an active area of research. The purpose of this study is to systematically review the ways that artificial intelligence (AI) and machine learning (ML), including deep learning (DL), methodologies can inform patient care during pregnancy and improve outcomes. MATERIALS AND METHODS: We searched English articles on EMBASE, PubMed and SCOPUS. Search terms included ML, AI, pregnancy and informatics. We included research articles and book chapters, excluding conference papers, editorials and notes. RESULTS: We identified 127 distinct studies from our queries that were relevant to our topic and included in the review. We found that supervised learning methods were more popular (n = 69) than unsupervised methods (n = 9). Popular methods included support vector machines (n = 30), artificial neural networks (n = 22), regression analysis (n = 17) and random forests (n = 16). Methods such as DL are beginning to gain traction (n = 13). Common areas within the pregnancy domain where AI and ML methods were used the most include prenatal care (e.g. fetal anomalies, placental functioning) (n = 73); perinatal care, birth and delivery (n = 20); and preterm birth (n = 13). Efforts to translate AI into clinical care include clinical decision support systems (n = 24) and mobile health applications (n = 9). CONCLUSIONS: Overall, we found that ML and AI methods are being employed to optimize pregnancy outcomes, including modern DL methods (n = 13). 
Future research should focus on less-studied pregnancy domain areas, including postnatal and postpartum care (n = 2). Also, more work on clinical adoption of AI methods and the ethical implications of such adoption is needed. In the field of medicine, the theory of 'joint decision-making' between humans and artificial intelligence (AI) holds the is a subset of AI, in the field of computer science. ML refers to a number of methods and algorithms, and different learning types: supervised, unsupervised and reinforcement learning [1, 2] . In the early days of AI in medicine, AI systems were standalone systems without direct connection to electronic health records (EHRs). Today, medicine is progressing towards a learning health system in which knowledge derived from information EHR data can be directly applied to care. Digitized clinical data in the EHR, genomics and biology present a wealth of information, open new opportunities and come with new challenges. AI and informatics methodologies are critical to enabling the learning health system. One of the ways that AI and ML can be used in healthcare is in enabling 'deep phenotyping'. In medicine, a disease phenotype refers to a deviation from healthy morphology or physiology [3] . Study of phenotypes requires knowledge of the spectrum of phenotypes associated with a disease entity. Defining what constitutes a diseased phenotype versus a healthy phenotype is often challenging. Deep phenotyping disease is a step towards precision medicine in which a comprehensive and precise phenotyping of disease presentation takes place. The individual components of the phenotype are observed, described and analyzed in order to develop knowledge of human disease. AI and ML methodologies naturally apply to characterize phenotypes in a 'deep' manner including multiple methodologies (e.g. genetics, imaging, diagnostics and so forth). 
These methods can exhaustively examine data with high granularity and dimensionality and make use of the broad range of data types that may be processed in deep phenotyping. For instance, nuanced phenotypic traits may be more readily available in unstructured data (e.g. clinical notes), requiring natural language processing (NLP) to identify relevant information [4]. Several diseases and adverse outcomes during pregnancy (e.g. preterm birth, preeclampsia and miscarriage) have complicated and poorly understood etiology, leaving little that can be done for prevention [5-7]. Deep phenotyping these patient states (i.e. pregnancy phenotypes) could help reduce adverse outcomes and provide further insight into diseases during pregnancy. AI and ML methods can be employed to enable deep phenotyping, especially with regard to the pregnancy state, where many different data types are used (e.g. ultrasound imaging, diagnostic screening, fetal monitoring, genetics). AI and ML in medicine form an emerging field whose theory and current applications have been described in detail across several medical disciplines [8, 9] and many disease areas and clinical states [8, 10]. However, AI aimed at improving women's health, specifically during pregnancy, has seen limited clinical use. In the 2019 International Medical Informatics Association Yearbook of Medical Informatics, there are no research articles focused on pregnancy and maternal health, illustrating the lack of research focus on this important aspect of women's health [11]. For researchers and clinicians alike, AI techniques hold promise to derive sound results and improve care at each stage of pregnancy [12-14]. Overall, the purpose of this study is to systematically review the ways that AI and ML methodologies can inform patient care during pregnancy and improve outcomes.
We seek to (a) describe which medical fields and informatics areas apply AI and ML, (b) identify where in pregnancy these methodologies are used, (c) describe clinical decision support systems (CDSSs) that employ AI or ML and (d) identify literature gaps for future research. We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines when conducting our literature review [15]. See the Supplementary Table. We searched three databases: EMBASE, PubMed and SCOPUS. PubMed is a freely available database of publications maintained by the United States National Center for Biotechnology Information. PubMed facilitates searching across three National Library of Medicine resources; the largest component is MEDLINE, followed by content from PubMed Central (PMC) and Bookshelf [16]. EMBASE is a bibliographic database focused on pharmacovigilance. SCOPUS is Elsevier's publication database containing articles from over 36 000 peer-reviewed journals. We used site licenses from the University of Pennsylvania libraries to search SCOPUS and EMBASE. The initial query was broad: in PubMed, we searched the keywords in all fields; in SCOPUS, we searched article title, abstract and keywords; and in EMBASE, we used quick search. On 5 December 2019, we used the following search query to cast a wide net of potentially relevant papers for inclusion in our review: AI AND pregnancy. On 18 February 2020, we completed a more focused query to identify papers that focused on ML: informatics AND pregnancy AND AI AND ML. The PubMed interface uses Automatic Term Mapping to map search words to their respective Medical Subject Headings terms [17]. After retrieving results from each database, we removed duplicate studies using exact PubMed ID match. When PubMed IDs were absent, we reviewed articles, comparing title, author list and publication date to further identify duplicate publications.
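The two-step duplicate removal described above (exact PubMed ID match first, then comparison of title, author and publication date when an ID is missing) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Record` fields, the title normalization and the example records are hypothetical, and the sketch automates a comparison that the review performed by manual inspection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal bibliographic record (field names are assumptions)."""
    title: str
    first_author: str
    year: int
    pmid: Optional[str] = None


def fallback_key(rec: Record) -> Tuple[str, str, int]:
    # Normalize the title: lower-case and keep only letters and digits,
    # so punctuation and spacing differences between databases are ignored.
    title = "".join(ch for ch in rec.title.lower() if ch.isalnum())
    return (title, rec.first_author.strip().lower(), rec.year)


def deduplicate(records: List[Record]) -> List[Record]:
    seen_pmids: Set[str] = set()
    # Maps a fallback key to True if at least one kept record with that key
    # has no PMID, i.e. the PMID comparison is impossible for that key.
    seen_keys: Dict[Tuple[str, str, int], bool] = {}
    unique: List[Record] = []
    for rec in records:
        key = fallback_key(rec)
        # Step 1: exact PubMed ID match.
        if rec.pmid is not None and rec.pmid in seen_pmids:
            continue
        # Step 2: title/author/date comparison, used only when a PMID
        # is absent on at least one side of the comparison.
        if key in seen_keys and (rec.pmid is None or seen_keys[key]):
            continue
        if rec.pmid is not None:
            seen_pmids.add(rec.pmid)
        seen_keys[key] = seen_keys.get(key, False) or rec.pmid is None
        unique.append(rec)
    return unique


# Invented example records, for illustration only.
example = [
    Record("Deep learning for embryo selection", "Smith", 2019, pmid="111"),
    Record("Deep learning for embryo selection", "Smith", 2019, pmid="111"),
    Record("Fetal Heart Rate Classification", "Lee", 2018),
    Record("Fetal heart-rate classification", "lee", 2018, pmid="222"),
    Record("Preterm birth risk", "Woolery", 1994, pmid="333"),
]
unique_records = deduplicate(example)
```

In practice, an automated fallback match like this would flag candidates for a reviewer rather than replace the manual check, since near-identical titles can belong to distinct publications.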
We filtered the results by excluding non-English studies, conference papers, editorials and notes. The eligibility criteria include AI and maternal health; papers were categorized by pregnancy stage and health concern. No location or publication date restrictions were applied and no unpublished papers were retained. Retrieved articles were inspected by an independent reviewer (L.D.); where the eligibility of a study was unclear, it was assessed by a second reviewer (M.R.B.). We excluded studies for which we were not able to gain access to the manuscripts. First, we searched EMBASE, PubMed and SCOPUS for articles on pregnancy and AI. We found 245 papers in SCOPUS, 181 in EMBASE and 128 in PubMed. After removing duplicate studies, the first query yielded 381 distinct research papers. The second query returned additional papers, including 4 from SCOPUS, 4 from EMBASE and 46 from PubMed. After removing duplicates from the new query, we had a result set of 427 distinct papers. Of these 427 distinct research studies, we excluded 9 non-English studies, 124 conference papers, 4 editorials and 5 notes, resulting in a set of 285 research papers (Figure 1). The next step was to assess the 285 remaining research articles for relevance. We manually reviewed the 285 articles to determine if they met the selection criteria: (1) focused on AI, (2) related to pregnancy and (3) related to healthcare and health. We grouped the studies by stage of pregnancy to illustrate the different clinical areas where AI and ML methods were applied: preconception, assisted reproductive technology (ART), prenatal screening and monitoring, preterm labor, birth and delivery (full term) and postnatal. These areas are shown in Figure 2; we distinguish between the pregnant person and the embryo, fetus or neonate, as methods are designed explicitly for each respective concern.
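The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable and function names are illustrative assumptions, not part of the review's workflow.

```python
# Counts as reported in the text (first query: 5 December 2019).
first_query = {"SCOPUS": 245, "EMBASE": 181, "PubMed": 128}
distinct_after_first = 381

# Counts as reported in the text (second query: 18 February 2020).
second_query = {"SCOPUS": 4, "EMBASE": 4, "PubMed": 46}
distinct_after_both = 427

# Exclusions by publication type or language, as reported in the text.
excluded = {"non-English": 9, "conference papers": 124, "editorials": 4, "notes": 5}


def screening_summary() -> dict:
    raw_first = sum(first_query.values())          # records retrieved by query 1
    raw_second = sum(second_query.values())        # records retrieved by query 2
    total_excluded = sum(excluded.values())
    return {
        "raw_first": raw_first,
        "duplicates_removed_first": raw_first - distinct_after_first,
        "raw_second": raw_second,
        "new_distinct_from_second": distinct_after_both - distinct_after_first,
        "total_excluded": total_excluded,
        "remaining_for_relevance_review": distinct_after_both - total_excluded,
    }


summary = screening_summary()
for name, value in summary.items():
    print(f"{name}: {value}")
```

The final value, 285, matches the number of articles carried forward to the manual relevance review.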
References to maternal and women's health reflect the terms and concepts of the research included in this paper. Pregnant people include transgender and gender nonconforming individuals; however, no papers (n = 0 studies) define this population explicitly. CDSSs are explicitly described in each category where they are present. Table 1 presents studies by pregnancy stage, with an overview of the methodology and results. Not included in Table 1 are one evaluation of an application described in an included paper [18] and eight review papers, which review applications in the following areas: general obstetrics [19], translational science [20], intrapartum surveillance [21, 22], fetal heart rate (FHR) [23], labor fetal assessment [24], ART [12] and mobile health (mHealth) for antepartum care [25]. AI may help address challenges in birth defects research [20]. Reviews found great promise in AI applications, but found that this promise was not yet realized in clinical care [21, 22, 25] or that external validation was missing [12]. Reviews caution that AI applications are an adjunct to, not a replacement for, healthcare professionals [19, 23]. One review found no evidence supporting improved pregnancy outcomes with AI applications [24]. The AI methods applied are shown in Figure 3. Refer to Table 2 for further references on the methodologies observed in this review, along with the pregnancy and informatics domains in which these methods were applied. Of the 127 studies, 16 applied AI and ML methods to retrospective EHR data. A total of 12 studies used expert systems as their AI method (Figure 3); these are often older, more historical AI methods applied to women's health. Studies tended to focus on the prenatal care aspect of pregnancy (Figure 2), with more studies focused on fetal aspects of prenatal care (n = 50 studies) than on maternal aspects of care (n = 23 studies).
Another popular area where AI methods were used was perinatal/birth and delivery (15 studies focused on maternal outcomes and 5 focused on fetal outcomes). Other stages are represented to a lesser extent, especially preconception care and postpartum care with two studies each ( Figure 2 ). Pregnancy care spans across several disciplines, some of which overlap: electronic monitoring, radiology and imaging, mHealth, CDSS, NLP and search analytics, genetic and chromosomal abnormalities, pregnancy complications, labor and delivery, and postpartum care. Electronic monitoring was the most common with 27 studies (Figure 4 ). Radiology and CDSS were also popular areas for using AI methods. Informatics areas that are only beginning to utilize AI methods for pregnancy-related care include mHealth (n = 9), chromosomal anomaly detection (n = 7) and NLP (n = 3). Development of provider-facing and direct-to-consumer mHealth applications supports preconception [26] , gestational disease management [27] , remote health monitoring [28] , low-resource prenatal care [29] [30] [31] , text messaging [32] [33] [34] , patient education [32, 35] , fetal health status prediction [36] , preeclampsia prediction [37] and perinatal depression [34] . A 2019 systematic review determined mHealth applications with promising potential for use by healthcare workers during antenatal care [25] : Babyscripts [32] , OpenSRP [29] , PANDA [28] , PotM [37] , mHealth Guatemala [30] , Expect With Me [35] , mPAMANECH [31] and COMMCARE [38] . A CDSS with mobile patient support predicts normal, pathological and potentially pathological fetal health status; significant features found include fetal age, maternal age, blood serotype, delivery number and illnesses regarding the current pregnancy [36] . Together, boosted decision tree (DT), decision forest (DF) and decision jungle were found to be the most efficient. 
Direct-to-consumer mHealth applications have potential to increase engagement and empower pregnant people in their healthcare. Three studies arose, focusing on predicting the fertility window [26] , text mining to understand communications with a sexual and reproductive health information service [33] and a mHealth text messaging system for perinatal depression [34] . It should be noted that in these studies the authors declare financial conflict of interest. A text mining approach was developed using naïve Bayes (NB) and basic NLP techniques, to understand how Kenyan men and women communicated with askNivi, a free sexual and reproductive health information service [33] . The users wrote most often about family planning methods, contraception, side effects, pregnancy, menstruation and sex. A majority of users sought factual information, followed by requests for advice and reporting symptoms. A prototype mHealth text messaging system for perinatal depression, Tess, was tested with mothers recruited from public hospitals outside of Nairobi, Kenya [34] . Prediction of a woman's fertile window through data received by a wearable bracelet achieved 90% accuracy using a random forest (RF) classifier [26] . A prospective longitudinal study determined what phase-based differences a wearable bracelet could detect in users' wrist skin temperature, heart rate, heart rate variability, respiratory rate and skin perfusion [26] . Applications address the needs of the pregnant person, support clinical care and often aim to limit in-person visits without compromising quality of care. 
Following the COVID-19 lockdown, development and application of mHealth and telehealth for pregnancy care, especially for pregnancy with comorbidities [39, 40], will be needed to provide adequate care while reducing exposure during prenatal care. A wealth of data is accessible through social media, including health behaviors and health outcomes. NLP techniques were applied to user-generated Twitter data to estimate and track the incidence of pregnancy [41] and to collect data on birth defect outcomes [42]. A support vector machine (SVM) was used to infer health events associated with individual users and found pregnancy to be the most commonly identified medical condition [43]. NLP techniques and SVM were applied to automatically classify eligibility criteria in ClinicalTrials.gov to facilitate patient-trial matching [44]. NLP is a valuable tool for public health and pharmacovigilance: it can analyze unstructured health data, improve EHR usability and facilitate interoperability. AI assessment of embryo images or videos has great potential to improve ART outcomes. Applications guide identification of embryos using the culture medium during early human in vitro development [45], raw time-lapse videos or images of embryos [46-48] and in vitro fertilization (IVF) EHR data [49, 50]. Evaluation of morphokinetic time-lapse microscopy data depends on the experience and knowledge of embryologists; this work is highly subjective and lacks standardization. Applications may reveal details of embryo morphology imperceptible to the human eye or predict successful pregnancy by integrating other relevant health data points.
DL can determine robust quantitative imaging biomarkers for embryo selection, improve therapy outcomes and reduce clinical burden. A convolutional neural network (CNN) was implemented to select the highest quality embryos using a large collection of human embryo time-lapse images from a high-volume fertility center in the United States [48]. The trained 22-layer deep model is called STORK. STORK performed well on additional datasets of embryo images from two other IVF centers. Grading systems are not standardized across clinics, which affects performance; STORK demonstrated lower performance for one clinic dataset and therefore lower generalizability. For embryo selection, STORK outperformed individual embryologists in assessing embryo image quality. However, STORK could not predict positive and negative live births using embryo morphology alone. Applications of AI have targeted fetal development: predicting fetal health status [36], improving fetal brain imaging [51-53] and imaging other fetal anatomy [54, 55]. A semi-automated learning-based framework was reported to improve gestational age prediction, with an accuracy of ±6.1 days, using DF on structural brain images and clinical data [56]. AI has been applied to improve knowledge and treatment of ectopic pregnancy, using ML classifiers [57] and gene stability algorithms [58]. Studies aim to improve imaging of fetal organ development with virtual organ computer-aided analysis (VOCAL) [55], texture analysis [52] and CNN [53, 59]. Three studies use EHR data; the higher performing classification approaches in these studies include DF [36, 56] and SVM [57]. Placental health is deeply connected with maternal-fetal health and plays an essential role in supporting fetal development [60].
Placental function and characteristics have been observed with VOCAL with 3D power Doppler, blood oxygen level-dependent magnetic resonance imaging (BOLD-MRI) [61, 62], ultrasound images [63] and CNN [64]. Research focuses on improving knowledge of placental pathology and sufficiency [62], placental volume imaging methods [61, 65] and placental maturity identification [63]. BOLD-MRI has gained attention as a promising non-invasive technique to monitor placental function in vivo; however, its use in practice is limited. The prenatal diagnosis rate of major congenital heart defects remains low, most likely due to the drawbacks of manual navigation during sonographic screening, which is operator-dependent, challenging and time-consuming. The Fetal Intelligent Navigation Echocardiography (FINE) method provides visualization of standard fetal echocardiography views from volume datasets obtained with spatiotemporal image correlation, aiming to consistently display diagnostic planes regardless of fetal position or initial orientation [54]. The Virtual Intelligent Sonographer Assistant tool was developed to visualize specific structures in the complex anatomy of the fetal heart. When applied to four abnormal cases, the FINE method demonstrated evidence of abnormal fetal cardiac anatomy in multiple echocardiography views. Definitive prenatal diagnosis of genetic disease is performed via invasive procedures, specifically amniocentesis or chorionic villus sampling. Risks include miscarriage, fetal morbidity, parental anxiety and Rh sensitization [66]. Non-invasive prenatal tests (NIPTs) carry no such risks and usually test only for trisomy 21 (T21). Trisomies 18 (T18) and 13 (T13) are the second and third most common trisomies after T21. Expansion of NIPT could include screening for T18 and T13 [67, 68] and other chromosomal abnormalities (OCA) [69, 70].
A number of ML techniques were applied to improve chromosomal screening, including artificial neural network (ANN) [69, 70]; SVM [70, 71]; k-nearest neighbor [70, 71]; and deep NN, RF, NB, DT and logistic regression (LR) [71]. Studies comparing methods found NN methods to perform best [70, 71]. AI methods were applied to optimize detection of relatively small mutations at a low sequencing cost [72]. In order to build a 'metabolic fingerprint' resulting from congenital anomalies of the central nervous system, ensemble learning was applied to characterize maternal serum [73]. During pregnancy, the pregnant person and the fetal-placental unit produce cell-free DNA (cfDNA). Using next generation sequencing, an SVM model demonstrated high accuracy for cfDNA testing [74]. Genome-wide cfDNA analysis detected OCA with high sensitivity in comparison with standard cfDNA screening [75]. A proposed two-stage routine procedure, with combined tests, invasive tests and cfDNA tests, achieved a high detection rate for T21 cases and proved to be minimally invasive and of relatively low cost [76]. Nuchal translucency (NT) measurement, a well-established ultrasonographic marker for fetal aneuploidy screening, can vary greatly between providers and as a result impair screening performance. The performance-adjusted risk (PAR) method allows for these differences in measurements and improves performance [77]. PAR analyzes individual provider and laboratory marker distribution parameters, compares them with national expectations and then assigns a handicap. NT and serum markers are considered in the method, using commercially available software and knowledge from meta-analysis of published literature. The PAR method informs providers of their handicap and their performance in relation to others. The FHR signal is more complicated to identify than the adult signal, and therefore, there has been an effort to improve methods to detect, sample and quantify the FHR signal accurately.
Computer analysis of the FHR is categorized into three stages: (1) raw signal processing, (2) pattern analysis and (3) expert/intelligent systems [23]. Early computer programs only identified deceleration patterns and outcomes of normal, warning or ominous patterns. Studies apply AI methods to ECG [78-84], magnetocardiography (MCG) [85, 86] and cardiotocogram (CTG) [87-90]. In an attempt to improve fetal signal extraction from ECG and MCG recordings, several supervised learning methodologies were proposed: adaptive neuro-fuzzy inference (ANFIS) [79, 84] and time-frequency analysis [81]. Unsupervised learning techniques were also applied: independent component analysis [78, 83, 85, 86], principal component analysis [46, 91] and Kalman filtering [82]. Moreover, studies applied a variety of ML methods to classify CTG signals and determine fetal state, with the best performances demonstrated using SVM [87, 89], DL [90] and RF [88]. A portable prototype expert system detected modeled fetal arrhythmias with 88% accuracy and could be applied to scenarios where there is an insufficient number of experts and a large number of patients [84]. The prototype uses ANFIS to extract the fetal ECG (fECG) from thoracic and abdominal signals, extracts the FHR and then detects fetal arrhythmia through an expert system based on production rules. Accurate prediction methods and diagnosis during prenatal care allow health professionals and prospective parents to detect problems with the pregnancy as early as possible. Some studies investigate patient comorbidities including gestational diabetes mellitus [18, 27, 92-97], gestational hypertension disorders [98-100] and bacteriuria [101]. Studies aim to predict and classify disease in early pregnancy, improve screenings and provide clinical decision support for disease management.
Methods proposed include rule-based [27, 97] , Bayesian networks [18, 94, 95] , ANN [92, 98, 101] , evolutionary radial basis function network [93] , genetic algorithm [99] , expert systems [94, 96] and DT [96] . Concerning preterm birth, studies focus on cervix-related risk [102, 103] , classifying true preterm labor [104, 105] , determining neonate mortality and prognosis [106, 107] , predicting preterm birth risk [108] [109] [110] [111] , professional learning [112] , estimating postnatal gestational age [113] and improving knowledge of gene regulatory elements in the placenta [114] . In 1982, Grignolio [106] presented a method to predict neonate mortality of premature newborns, using multiple regression. In 1990, Andersen et al. [115] found that shorter cervical length (CL) was associated with a high risk of preterm delivery. Woolery [108, 116] developed a clinical knowledge base for preterm birth risk assessment and developed an expert system for preterm birth risk assessment of pregnant women. For pregnancy and labor contraction classification, studies found that AI methods improve diagnosis from EHG records, including polynomial classifiers [104] , feature selection using binary particle swarm optimization with quadratic discriminant analysis [105] and a ridge extraction method [117] . Cervix properties were observed using AI methods to determine material properties [103] and CL [115] . Perinatal outcome was predicted with a DL model interpreting amniotic fluid metabolomics and proteomics in asymptomatic pregnant women with short CL [109] . SVM was applied to predict genomewide placental enhancers, in order to further the understanding of placental dysfunction and implications of preterm birth and preeclampsia [114] . Two studies turn to EHR data to determine preterm birth risk, using ANN, LR, RF [110] and recurrent neural networks [111] . Both studies found associations with preterm birth risk and hypertensive disorders, as well as CL. 
Ectopic pregnancy can be potentially life-threatening, and therefore, early diagnosis and treatment of the condition are needed. A three-stage classifier (3SC) developed finds and limits diagnostic errors and then assists clinicians in choosing the initial treatment of ectopic pregnancy [57] . The model was evaluated across four algorithms: SVM, NB, DL and auto multilayer perceptron (MLP). 3SC achieved the best performance, in comparison with single-stage classifiers. At the event of fetal-maternal hemorrhage, the standard clinical method to quantify fetal and maternal red blood cells (RBCs) is the Kleihauer-Betke test, which is performed by a certified technologist. The automated system can count over 60 000 RBCs within 5 minutes, versus a technologist counting ∼2000 in about 15 minutes [118] . The Apgar score [119, 120] evaluates the condition of newborns in their first minutes of neonatal life. However, the score is not intended as a prognostic value; it has limitations and is often highly subjective, depending largely on the clinical experience of the evaluator. The prognostic value of the Expanded Apgar Score Form [121] was studied, observing the early death incidence in preterm newborns [107] . The chi-square automatic interaction detection classification tree generated decision rules based upon clinical data. The concentration of the oxygen applied during resuscitation was found to be an important criterion to neonatal outcome. An open-source simulator, called NICeSim, was developed to aid health professionals to better understand the treatment and prognosis of premature newborns admitted to the neonatal intensive care unit [112] . The system allows for flexibility and provides alterations of the chosen variables. The attributes used to calculate death risk include Apgar scale, respiratory distress syndrome, gestational age and birth weight. 
Sufficient predictive power was demonstrated; ANN achieved better accuracy and specificity, whereas SVM achieved better sensitivity. Important attributes are most likely absent, as the ML model uses only four attributes for a complex problem. Clinicians and prospective parents want to be informed about the well-being of the unborn infant. AI applications observe infant growth and health status from imaging and maternal EHR data history. ML methods were applied to predict babies' birth weight, using LR [122, 123], MLP [123], fuzzy logic support vector regression [124], RF [122], Bayesian models [122, 125] and a generalized boosted model [122]. The studies predict weight using EHR data [123, 124, 126], amniotic fluid [125], ultrasound images [122] and CTG traces [127]. When extracting the FHR from an ECG recording, uterine contractions during labor can introduce significant noise. In order to improve perinatal knowledge and outcomes, AI has been applied to estimating date of delivery [115], route of delivery [128-131], fetal health prognosis [132] and length of labor [133]. Methods have been developed to provide perinatal knowledge [134], improve fetal monitoring and assessment [87-89, 117, 135-139] and classify true labor contractions [140]. A single paper employed unsupervised ML, using k-means clustering [139]. A 2015 systematic review found no strong evidence that the use of CTG with an expert system affects the incidence of cesarean delivery or reduces the incidence of forceps-assisted vaginal birth [24]. Based solely upon FHR traces, models were developed to improve classification of route of delivery [129, 131] and of pregnancy and labor contractions [105]. Performance varied between ML techniques and with the number of features included in the models; using 13 features, the DL classifier demonstrated the best performance, whereas with 8 selected features, the RF classifier performed best [129].
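Several of the comparisons above are reported in terms of accuracy, sensitivity and specificity, and a model can lead on one measure while trailing on another. The sketch below computes the three measures from a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from any study in this review.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    positives = tp + fn
    negatives = tn + fp
    return {
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of true positive cases that were detected.
        "sensitivity": tp / positives if positives else float("nan"),
        # Fraction of true negative cases that were correctly ruled out.
        "specificity": tn / negatives if negatives else float("nan"),
    }


# Hypothetical counts for two classifiers evaluated on the same 100 cases
# (50 positive, 50 negative); invented for illustration only.
model_a = classification_metrics(tp=40, fp=5, tn=45, fn=10)
model_b = classification_metrics(tp=46, fp=14, tn=36, fn=4)
```

With these invented counts, the first classifier has higher accuracy and specificity while the second has higher sensitivity, the same kind of trade-off described for ANN and SVM above.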
An ensemble of three ML classifiers, FLDA, RF and SVM, allows the strengths of each model to be used to classify between cesarean section and vaginal delivery [131]. Models take advantage of available EHR data in conjunction with labor monitoring data. Another study found that an aggregation double-layer SVM model using contextualized EHG parameters performed best at distinguishing patients who will achieve spontaneous labor before the end of full term from those who will require late-term induction of labor, based upon EHG signals and obstetrical parameters [128]. Obstetrical parameters include maternal age, BMI, gestations, parity, Bishop score and days of gestation at the moment of recording. The Adana System applied ANN to classify between cesarean section and vaginal delivery, with input variables including maternal characteristics and labor information [130]. Predicting route of delivery can inform care, allow appropriate allocation of resources and improve pregnancy outcomes. During the postpartum period, many mothers try to cope with physical, social and psychological changes [141, 142]. Postpartum depression (PPD) affects an estimated 13-19% of women who have recently given birth [143]. Wang et al. [144] applied several ML models to predict PPD using EHRs and found several associated risk factors: race, obesity, anxiety, depression, antidepressants and anti-inflammatory drugs during pregnancy and different types of pain. AI has potential for pregnancy care, but not without barriers and potential pitfalls. Challenges include dataset shift, discriminatory bias, generalizability, poor clinical applicability, accidental fitting of confounders and unintended negative consequences on health outcomes [145]. For instance, no papers were retrieved concerning transgender and gender non-conforming pregnancy.
We also performed another literature search and found that there were no transgender-related AI studies on pregnancy. Many transgender men and gender non-conforming individuals (assigned female at birth) retain the capacity to become pregnant, use contraception, desire to become pregnant and give birth, and there are some case studies on this process [146, 147]. Little research focuses on the reproductive needs of this population [146, 148-150], and stigma can lead individuals to avoid seeking medical care or disclosing medically relevant health information [151]. Transgender individuals are likely to have unique needs surrounding pregnancy health and psychosocial health (e.g. gender dysphoria, lactation following chest surgery or binding, hormone replacement therapy) [151]. It would not be ethical to apply off-the-shelf AI methods designed for gender-conforming populations to transgender or gender non-conforming populations.

It is unrealistic to assume that AI applications in health are ethically neutral. AI technology has the capacity to violate the basic rights of individuals, for example by learning patterns from past clinical care that might increase discrimination (e.g. structural racism), violating privacy and inadvertently creating inequalities in care. Unintentional discrimination (e.g. against vulnerable populations, sexism and racism) needs to be proactively accounted for in algorithms [152, 153], especially given that algorithms utilize data collected in the past that may be unintentionally biased in various ways. Patients and healthcare providers need a good understanding of the algorithm's decision-making in order to have agency over the resulting clinical decisions. In congruence with current best practices, AI needs to provide room for healthcare providers to influence the decision-making process.
The patterns underlying AI decisions should also be made transparent to researchers so that they can understand the potential biases underlying these algorithms.

We used 'pregnancy' as one of the search terms for identifying relevant papers. Therefore, preconception and postpartum studies were only included if they also discussed a pregnancy element. This may have lowered the overall number of studies focused on the preconception and postpartum stages included in our literature review. Preconception care and postpartum care encompass healthcare beyond the scope of this particular review, and therefore, this study does not represent AI and ML applications in the entirety of these fields. Studies included in our review do not explicitly report the gender of their patients; therefore, it is unclear how many transgender patients may have been included in the research studies that we cite.

We found that AI and ML methodologies are used to help inform pregnancy outcomes. Of the included studies, 16 took advantage of EHR data to derive sound and reliable information about maternal and fetal health, illustrating the need to increase research using EHR data. Applications of mHealth were used in nine studies in the following areas: preconception [26, 33], during gestation [27, 36, 93, 96, 97, 154] and postpartum [34]. Unsupervised methods were limited to automated EHR and contraction monitoring [78, 82, 83, 85, 86, 139], imaging [46], genetics [91] and trisomy screening [69] applications. Research aimed to detect anomalies and diagnoses earlier, inform practice and support patients, improve fetal and maternal health knowledge, deep phenotype obstetric complications and automate aspects of current practice. Past reviews of AI applications in pregnancy research and healthcare have described great potential but little realization in the clinical setting.
While the papers in our review did not explicitly describe adoption in the clinical setting, the variety of CDSS (n = 24) and mHealth applications (n = 9) in our review illustrates the effort to translate AI into clinical care. Low adoption of AI in clinical care in the pregnancy domain is likely due to unresolved liability questions in case of medical error, especially for so-called black-box algorithms. More research on maternal healthcare is needed, with only 31% of prenatal stage papers focused on maternal needs. In addition, future research should proactively account for unintentional discrimination in algorithms. Potential pitfalls need to be accounted for in the development of AI algorithms, in order to prevent adverse events and negative clinical impacts. Greater use of unsupervised methods may be appropriate, as only nine studies used this approach, but could be limited by the interpretability of their results. DL has begun to gain traction in recent years [10]; unsupervised ML and DL naturally apply well to imaging analysis and large datasets in biomedicine. The amalgamation of supervised and unsupervised learning (i.e. semi-supervised ML) could be valuable, as labeling data for supervised ML methods is time-consuming and expensive. AI and ML methods can be successfully employed to optimize pregnancy outcomes; with proper algorithmic improvement, refinement and ethics results, these methods could be incorporated into clinical care.

• Artificial intelligence and machine learning methods can be successfully employed to optimize pregnancy outcomes.
• Supervised methods are the most common approach, and deep learning methods are gaining traction.
• Future research should focus on less-studied areas: maternal healthcare needs, postpartum care, pregnancy care for transgender individuals, and algorithmic improvement, refinement and translation of results back into clinical care.

Supplementary data are available online at Briefings in Bioinformatics.
This is a review paper, and the primary data are the analyses of the papers themselves. This analysis, which is the main result of our work, is included in Table 1 for all to access. We have also included a PRISMA checklist as a supplementary document. Any additional data requests can be sent to the corresponding author: bolandm@upenn.edu

This work was supported by generous funds from the Perelman School of Medicine at the University of Pennsylvania.

Applications of artificial neural networks in health care organizational decision-making: a scoping review Survey of machine learning algorithms for disease diagnostic Deep phenotyping for precision medicine Deep phenotyping: embracing complexity and temporality-towards scalability, portability, and interoperability Preterm labor and preterm birth Etiopathogenesis, prediction, and prevention of preeclampsia New insights into mechanisms behind miscarriage Machine learning in medicine eDoctor: machine learning and the future of medicine Deep learning for healthcare: review, opportunities and challenges Contents IMIA yearbook of medical informatics 2019 A review of machine learning approaches in assisted reproductive technologies Use of artificial intelligence (AI) in the interpretation of intrapartum fetal heart rate (FHR) tracings: a systematic review and meta-analysis Artificial intelligence: a new paradigm in obstetrics and Gynecology research and clinical practice The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration Exploring PubMed as a reliable resource for scholarly communications services Medical Informatics in a United and Healthy Europe: Proceedings of MIE 2009, the XXII International Congress of the European Federation for Medical Informatics Evaluation of DIABNET, a decision support system for therapy planning in gestational diabetes Computer applications in obstetrics Integrative database
management for mouse development: systems and concepts Intelligent fetal heart rate computer systems in intrapartum surveillance Future perspectives in intrapartum fetal surveillance Computer analysis of the fetal heart rate Expert systems for fetal assessment in labour Mobile technology in health (mHealth) and antenatal care-searching for apps and available solutions: a systematic review Wearable sensors reveal menses-driven changes in physiology and enable prediction of the fertile window: observational study Gestational diabetes management using smart mobile telemedicine Usability and feasibility of a mobile health system to provide comprehensive antenatal care in low-income countries: PANDA mHealth pilot study in Madagascar World Health Organization. OpenSRP|Open Smart Register Platform An mHealth monitoring system for traditional birth attendant-led antenatal risk assessment in rural Guatemala The role of a decision-support smartphone application in enhancing community health volunteers' effectiveness to improve maternal and newborn outcomes in Nairobi, Kenya: quasiexperimental research protocol Testing the feasibility of remote patient monitoring in prenatal care using a mobile app and connected devices: a prospective observational trial What is the best method of family planning for me?': a text mining analysis of messages between users and agents of a digital health service in Kenya Expanding access to depression treatment in Kenya through automated psychological support: protocol for a singlecase experimental design pilot study Expect with me: development and evaluation design for an innovative model of group prenatal care to improve perinatal outcomes Fetal health status prediction based on maternal clinical history using machine learning techniques Usability and feasibility of PIERS on the move: an mHealth app for pre-eclampsia triage Continuum of care services for maternal and child health using mobile technology -a health system strengthening strategy 
in low and middle income countries Managing diabetes in pregnancy before, during, and after COVID-19 Telehealth for high-risk pregnancies in the setting of the COVID-19 pandemic Twitter: a good place to detect health conditions Social media mining for birth defects research: a rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter Automatic identification of web-based risk markers for health events Automated classification of eligibility criteria in clinical trials to facilitate patient-trial matching for specific patient populations The search for biomarkers of human embryo developmental potential in IVF: a comprehensive proteomic approach How much information about embryo implantation potential is included in morphokinetic data? A prediction model based on artificial neural networks and principal component analysis Deep learning as a predictive tool for fetal heart pregnancy following time-lapse incubation and blastocyst transfer Deep learning enables robust assessment and selection of human blastocysts after in vitro fertilization Bayesian classification for the selection of in vitro human embryos using morphological and clinical data Predictive modeling of implantation outcome in an in vitro fertilization setting: an application of machine learning methods Artificial intelligence assistance for fetal head biometry: assessment of automated measurement software Parameter set for computer-assisted texture analysis of fetal brain Realtime deep pose estimation with geodesic loss for imageto-template rigid registration Fetal intelligent navigation echocardiography (FINE): a novel method for rapid, simple, and automatic examination of the fetal heart Fetal thymus volume estimation by virtual organ computer-aided analysis in normal pregnancies Learning-based prediction of gestational age from ultrasound images of the fetal brain A decision support system for predicting the treatment of ectopic pregnancies Variation in stability 
of endogenous reference genes in fallopian tubes and endometrium from healthy and ectopic pregnant women 3-D reconstruction in canonical co-ordinate space from arbitrarily oriented 2-D images The placenta: a multifaceted, transient organ Spatiotemporal alignment of in utero BOLD-MRI series Predicting congenital heart defects: a comparison of three data mining methods Discriminative learning for automatic staging of placental maturity via multi-layer fisher vector Fully automated, real-time 3D ultrasound segmentation to estimate first trimester placental volume using deep learning Selection of the subnoise gain level for acquisition of VOCAL data sets: a reliability study Chorionic villus sampling and amniocentesis: recommendations for prenatal counseling Screening for trisomies 21, 18 and 13 by maternal age, fetal nuchal translucency, fetal heart rate, free β-hCG and pregnancyassociated plasma protein-a Screening and outcome of chromosomal abnormalities other than trisomy 21 in northern Finland Intelligent noninvasive diagnosis of aneuploidy: raw values and highly imbalanced dataset First trimester noninvasive prenatal diagnosis: a computational intelligence approach Evaluation of machine learning algorithms for improved risk assessment for Down's syndrome High resolution noninvasive detection of a fetal microdeletion using the GCREM algorithm A metabolomics-based approach for non-invasive screening of fetal central nervous system anomalies Improving the calling of noninvasive prenatal testing on 13−/18−/21-trisomy by support vector machine discrimination The clinical utility of genome-wide non-invasive prenatal screening Two-stage approach for risk estimation of fetal trisomy 21 and other aneuploidies using computational intelligence systems Performance adjusted risks: a method to improve the quality of algorithm performance while allowing all to play A resampling approach to estimate the stability of one-dimensional or multidimensional independent components 
Extraction of Fetal electrocardiogram using adaptive neuro-fuzzy inference systems CAD for detection of fetal electrocardiogram by using wavelets and neuro-fuzzy systems An automated methodology for Fetal heart rate extraction from the abdominal electrocardiogram Fetal QRS extraction from abdominal recordings via model-based signal processing and intelligent signal merging An efficient unsupervised fetal QRS complex detection from abdominal maternal ECG A portable prototype for diagnosing fetal arrhythmia. Unlocked: Informatics Med A method for the automatic reconstruction of fetal cardiac signals from magnetocardiographic recordings Entropy-based automated classification of independent components separated from fMCG Determination of fetal state from cardiotocogram using LS-SVM with particle swarm optimization and binary decision tree Classification of the cardiotocogram data for anticipation of fetal risks using machine learning techniques Fuzzy analysis of delivery outcome attributes for improving the automated Fetal state assessment Cardiotocographic diagnosis of fetal health based on multiclass morphologic pattern predictions using deep learning classification Exon level machine learning analyses elucidate novel candidate miRNA targets in an avian model of fetal alcohol spectrum disorder Artificial intelligence technology as a tool for initial GDM screening Evolutionary radial basis function network for gestational diabetes data analytics Heterogeneous methodology to support the early diagnosis of gestational diabetes DIABNET: a qualitative model-based advisory system for therapy planning in gestational diabetes A web-based clinical decision support system for gestational diabetes: automatic diet prescription and detection of insulin needs Assessment of a personalized and distributed patient guidance system Artificial neural network for normal, hypertensive, and preeclamptic pregnancy classification using maternal heart rate variability indexes Integrating 
multiple 'omics' analyses identifies serological protein biomarkers for preeclampsia Neurofuzzy model for HELLP syndrome prediction in mobile cloud computing environments Using artificial intelligence to reduce diagnostic workload without compromising detection of urinary tract infections A novel algorithm for computerassisted measurement of cervical length from transvaginal ultrasound images Anisotropic material characterization of human cervix tissue based on indentation and inverse finite element analysis Prediction of preterm deliveries from EHG signals using machine learning Comparison of different EHG feature selection methods for the detection of preterm labor Medical diagnoses by artificial intelligence process Practical application and prognostic value of the expanded Apgar score Machine learning for an expert system to predict preterm birth risk Artificial intelligence and amniotic fluid multiomics: prediction of perinatal outcome in asymptomatic women with short cervix Artificial neural network analysis of spontaneous preterm labor and birth and its major determinants Deep learning predicts extreme preterm birth from electronic health records NICeSim: an open-source simulator based on machine learning techniques to support medical research on prenatal and perinatal care decision making Postnatal gestational age estimation of newborns using small sample deep learning Genome-wide maps of distal gene regulatory enhancers active in the human placenta Prediction of risk for preterm delivery by ultrasonographic measurement of cervical length Clinical knowledge base development for preterm-birth risk assessment Ridge extraction from the time-frequency representation (TFR) of signals based on an image processing approach: application to the analysis of uterine electromyogram AR TFR A system for counting fetal and maternal red blood cells A proposal for a new method of evaluation of the newborn Evaluation of the newborn infant-second report The Apgar score 
Machine learning for fetal growth prediction Prediction methods for babies' birth weight using linear and nonlinear regression analysis Fetal weight estimation using the evolutionary fuzzy support vector regression for low-birth-weight fetuses Early prediction of macrosomia based on an analysis of second trimester amniotic fluid by capillary electrophoresis Prediction of fetal weight at varying gestational age in the absence of ultrasound examination using ensemble learning Integrating machine learning techniques and physiology based heart rate features for antepartum fetal monitoring Prediction of labor onset type: spontaneous vs induced; role of electrohysterography? Classification of caesarean section and normal vaginal deliveries using foetal heart rate signals and advanced machine learning algorithms Computerized prediction system for the route of delivery (vaginal birth versus cesarean section) Machine learning ensemble modelling to classify caesarean section and vaginal delivery types using cardiotocography traces A perinatal monitoring display based on the fetal topogram Predicting the duration of the first stage of spontaneous labor using a neural network A prototype system for perinatal knowledge engineering using an artificial intelligence tool Preliminary evaluation of an intelligent system for the management of labour An automated intelligent diagnostic system for the interpretation of umbilical artery Doppler velocimetry A computerized diagnostic system for the interpretation of umbilical artery blood flow velocity waveforms Fetal ECG extraction during labor using an adaptive maternal beat subtraction technique Detection of uterine MMG contractions using a multiple change point estimator and the K-means cluster algorithm Discriminating pregnancy and labour in electrohysterogram by sample entropy and support vector machine Contemporary women's adaptation to motherhood: the first 3 to 6 weeks postpartum The relationship between maternal self-confidence 
and postpartum depression in primipara mothers: a follow-up study Postpartum depression: current status and future directions Using electronic health records and machine learning to predict postpartum depression Key challenges for delivering clinical impact with artificial intelligence Transgender men who experienced pregnancy after female-to-male gender transitioning Family planning and contraception use in transgender men Transmasculine individuals' experiences with lactation, chestfeeding, and gender identity: a qualitative study Transgender men and pregnancy Fertility preservation options for transgender individuals From erasure to opportunity: a qualitative study of the experiences of transgender men around pregnancy and recommendations for providers The ethical algorithm: the science of socially aware algorithm design There is no such thing as race in health-care algorithms Averaged one-dependence estimators on edge devices for smart pregnancy data analysis SVMs -a practical consequence of learning theory Overview of artificial neural network models in the biomedical domain Multilayer perceptron, fuzzy sets, and classification Radial basis function neural networks: a topical state-of-the-art survey Calibrating random forests for probability estimation Points of significance: Bayes' theorem A fuzzy k-nearest neighbor algorithm Regularized discriminant analysis An Introduction to Genetic Algorithms Opportunities and obstacles for deep learning in biology and medicine Principal component analysis Independent component analysis: an introduction Data clustering: 50 years beyond k-means Authors' contribution L.M.D. and M.R.B. conceived study design; wrote paper and reviewed, edited and approved final manuscript.