key: cord-0793179-soc3iapa authors: McCudden, Christopher R title: Deus Ex Machina? Predicting SARS-CoV-2 Infection from Lab Tests Using Machine Learning date: 2020-10-28 journal: Clin Chem DOI: 10.1093/clinchem/hvaa212 sha: 40dec7d8b6202cc67225ca8962bed6935a5672ed doc_id: 793179 cord_uid: soc3iapa
Coronavirus disease 2019 (COVID-19) has disrupted lives, the economy, and healthcare systems across the globe, unlike any infectious disease in 100 years. As we collectively seek to survive and emerge from this ongoing crisis, it is worth evaluating any scientific discovery that may help reduce health risks or address barriers to the response. One particular barrier that will persist through the pandemic is the speed and availability of COVID-19 diagnostic testing. Test availability continues to be impeded by global supply chain shortages and logistical challenges, which have often caused long turnaround times and delayed results. This problem is partially addressed in the study by Yang and coworkers (1), who aim to predict SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) infection before COVID-19 reverse transcription-PCR results are available by combining routinely available laboratory results with modern machine learning methods. Applying machine learning to lab results can be useful when the relationship between individual analytes and disease state is complex or unknown, as is the case with COVID-19. The machine learning algorithms proposed by Yang and coworkers could be useful in the absence of definitive test results to help guide patient management decisions. The study has several notable strengths, including the apparent generalizability of the prediction, the ability to improve the algorithm as additional cases are added, and the use of widely available lab tests.
In terms of available data, routine complete blood count, coagulation, electrolyte, and kidney and liver function tests are the most commonly ordered laboratory tests, providing a ready source of predictors without the need to order additional tests. Because machine learning algorithms are highly amenable to retraining, continuously adding more classified data (patients with known COVID-19 status) should improve the overall performance. Indeed, the performance of machine learning algorithms generally improves with larger data sets (2). Cross-validation across 2 different hospitals with different instrumentation demonstrates that the algorithm has the potential to be used widely.
In a real-world setting, implementing predictive algorithms for COVID-19 presents several challenges, including (a) integration into electronic medical records (EMRs), (b) reporting of predictions, and (c) the inherent opacity of machine learning algorithms. Underpinning these challenges are the key questions of what a given prediction indicates and what action a physician can take with an individual patient. EMRs, laboratory information systems, and middleware systems can be programmed to perform a wide array of calculations; however, the use of machine learning methods requires integration between the highly specialized software that generates the predictions and the laboratory information system or EMR. In the current situation, this would entail a continuous exchange of laboratory data and predictions between the prediction software (the Python programming language with the scikit-learn machine learning library) and the laboratory information system. Although integrated and embedded machine learning in EMRs is often touted as the next great advance in medicine, it is currently neither common nor trivial to implement; perhaps this is another technology adoption that will be driven faster by COVID-19.
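To make the integration and reporting challenges concrete, the exchange between a laboratory information system and a Python prediction service might be sketched as follows. This is an illustration under stated assumptions, not the implementation used by Yang and coworkers: the JSON message layout, the analyte names in `FEATURES`, the `StubModel` class, and the reporting cut-offs are all hypothetical. The stub stands in for any fitted classifier that exposes a scikit-learn-style `predict_proba` method, so the sketch runs with the standard library alone.

```python
import json

# Hypothetical analyte names drawn from routine panels; a real model
# would define its own features and their order.
FEATURES = ["wbc", "platelets", "creatinine", "alt"]


class StubModel:
    """Stand-in for a fitted classifier exposing predict_proba.

    Not a real model: it returns a fixed probability so the sketch runs
    without a trained estimator or any third-party library.
    """

    def __init__(self, probability):
        self.probability = probability

    def predict_proba(self, rows):
        # Mimic the [P(negative), P(positive)] row layout that
        # scikit-learn binary classifiers return.
        return [[1.0 - self.probability, self.probability] for _ in rows]


def format_report(probability, style="score", low=0.2, high=0.8, cutoff=0.5):
    """Render a probability in one of three presentation styles.

    `low`, `high`, and `cutoff` are placeholders, not validated thresholds.
    """
    if style == "score":
        return f"{probability:.2f}"
    if style == "keyword":
        if probability < low:
            return "low"
        if probability < high:
            return "medium"
        return "high"
    if style == "binary":
        return "detected" if probability >= cutoff else "undetected"
    raise ValueError(f"unknown report style: {style}")


def handle_message(message, model, style="score"):
    """Turn a JSON lab-result message into a JSON prediction message."""
    payload = json.loads(message)
    results = payload["results"]
    missing = [name for name in FEATURES if name not in results]
    if missing:
        # Refuse to predict on an incomplete panel rather than guess values.
        return json.dumps({"patient_id": payload["patient_id"],
                           "status": "incomplete",
                           "missing": missing})
    row = [float(results[name]) for name in FEATURES]
    probability = model.predict_proba([row])[0][1]
    return json.dumps({"patient_id": payload["patient_id"],
                       "status": "ok",
                       "report": format_report(probability, style)})


# Example exchange with made-up values and a stub probability of 0.87.
demo_message = json.dumps({
    "patient_id": "demo-001",
    "results": {"wbc": 5.1, "platelets": 180, "creatinine": 0.9, "alt": 35},
})
print(handle_message(demo_message, StubModel(0.87), style="keyword"))
```

Passing the model in as an argument keeps the message handling separate from the estimator, so a retrained classifier could be swapped in without touching the exchange logic. In practice, the message format would be dictated by the laboratory information system or EMR rather than chosen freely as it is here.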
Last, related to the use of predictions and EMR integration, is how to report the probabilities generated by the algorithms. For example, should predictions be provided as a probability score, a risk-related keyword (low, medium, high), a binary measure (detected or undetected), or a textual report explaining the results and algorithm? Overall, adoption is likely to depend on how easy it is to convey what the prediction can and cannot provide.
Regardless of the challenges, the incredible strain of COVID-19 on healthcare systems necessitates new approaches to diagnostics, patient management, and data use. With that context, the algorithms presented by Yang and coworkers have the potential to augment more conventional methods for rapid assessment of patients with COVID-19.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 4 requirements: (a) significant contributions to the conception and design,
Authors' Disclosures or Potential Conflicts of Interest: No authors declared any potential conflicts of interest.
References: (1) Routine laboratory blood tests predict SARS-CoV-2 infection using machine learning. (2) The unreasonable effectiveness of data.