key: cord-1054263-46o84oci authors: Qian, Feng; Zhang, Andrew title: The Value of Federated Learning During and Post COVID-19 date: 2021-02-04 journal: Int J Qual Health Care DOI: 10.1093/intqhc/mzab010 sha: 8ab4e43bb045041a999030d6e9afcacdc11ce0f8 doc_id: 1054263 cord_uid: 46o84oci
Federated learning (FL), a distributed machine learning (ML) technique, has recently attracted increasing attention from healthcare stakeholders.(1-3) As privacy-sensitive personal data accumulate rapidly in the silos of healthcare delivery, FL is perceived as a promising decentralized approach to data privacy and security concerns: the privacy-sensitive data are stored and maintained locally, while multiple institutions train ML models collaboratively. The term "federated learning" was coined by McMahan et al in their seminal paper.(4) By definition, FL is an ML technique that trains a shared global algorithm across the multiple decentralized sites where local data samples originate. This differs from traditional centralized ML techniques, in which all local datasets are uploaded to a central server and local data samples are assumed to be identically distributed. Subsequent research has significantly refined FL, which shows great potential to disrupt the healthcare industry, as evidenced by recent applications in COVID-19 and non-COVID-19 scenarios. Despite this promise, FL remains a relatively new concept for physicians, patients, payers, health researchers, and healthcare regulators. While the world is struggling to control the unprecedented COVID-19 pandemic, it is important to know whether and how FL can add value to care during and after COVID-19. We aim to describe real-world use cases of FL in both COVID-19 and non-COVID-19 scenarios (Table 1).
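To make the definition above concrete, the following is a minimal sketch of one common FL scheme, federated averaging, written in plain Python. It is an illustration under stated assumptions, not the method of any study cited here: the one-parameter model y = w * x, the three synthetic "sites", the learning rate, and all function names (`local_update`, `federated_round`, `train`) are hypothetical. The point it tries to show is that each site trains on its own data and only model weights, never the records themselves, are sent to the coordinating server.

```python
# Minimal federated averaging sketch (hypothetical names, synthetic data).
from typing import List, Tuple

Site = List[Tuple[float, float]]  # (x, y) pairs held locally at one institution


def local_update(w: float, data: Site, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one site's data for the model y = w * x."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, sites: List[Site]) -> float:
    """One round: every site trains locally; the server sees only weights and counts."""
    total = sum(len(site) for site in sites)
    local_results = [(local_update(w_global, site), len(site)) for site in sites]
    # The server averages the local weights, weighted by each site's sample count.
    return sum(w * n for w, n in local_results) / total


def train(sites: List[Site], rounds: int = 50, w0: float = 0.0) -> float:
    """Repeat federated rounds starting from an initial global weight."""
    w = w0
    for _ in range(rounds):
        w = federated_round(w, sites)
    return w


# Three synthetic sites of different sizes, all generated from y = 2 * x.
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
    [(0.5, 1.0)],
]
w_final = train(sites)
```

In this toy setting the sites share the same underlying relationship, so the averaged weight settles near the true slope. Real deployments face the heterogeneity, communication, and privacy issues discussed later in this article, which this sketch does not address.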
FL is a "learning technique that allows users to collectively reap the benefits of shared models trained from this rich data, without the need to centrally store it" and can substantially reduce data privacy and security risks.(4) Moreover, FL-trained models have been reported to perform far better than ML models trained on a single isolated data bank, and as well as ML models trained on centrally hosted datasets.(1, 3) It is worth noting that training ML models on centralized datasets faces enormous challenges in healthcare applications because of stringent legal and administrative regulations, technical difficulties, and patient data privacy concerns.
FL's value in healthcare extends well beyond COVID-19 care. The pandemic has massively disrupted non-COVID-19 multi-center clinical studies, and how to use the data already generated at each participating institution presents an ever-greater challenge. Notably, decentralized clinical research has recently been advocated as a way to include traditionally underrepresented study subgroups and underserved areas. Such decentralized studies will benefit greatly if FL is incorporated into the study design and data analysis to evaluate quality of care and outcomes, for example by predicting mortality, complications, hospitalizations, and adverse drug events.(7-9) Another promising area for FL is digital health.(1) The pandemic has catalyzed the digital transformation, as evidenced by the fact that the use of telehealth, mobile health, wearable devices, and remote patient monitoring has skyrocketed. Properly executed, FL can yield more generalizable models that help achieve equitable, effective, and patient-centered care.
Yet FL has limitations and faces challenges, including data leakage, inhomogeneous data distribution, algorithm training optimality, privacy vs.
performance trade-off, traceability, and integrity assurance standards.(1, 3) Local implementation of FL needs to address the practical issues of a steep learning curve, communication cost, systems and statistical heterogeneity, and talent acquisition and retention. Continuous advancements in FL, such as novel models of asynchrony, heterogeneity diagnostics, and granular privacy constraints, will hopefully overcome these challenges as AI, ML, edge computing, and the Internet of Things (IoT), coupled with the explosive growth of healthcare data, rapidly create innovative solutions. We believe that FL can play a significant role in fighting the ongoing COVID-19 pandemic and in accelerating data-driven precision medicine.
The future of digital health with federated learning
Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data
Federated Learning for Communication-Efficient Learning of Deep Networks from Decentralized Data
A collaborative online AI engine for CT-based COVID-19 diagnosis. medRxiv
Electronic Health Records Improves Mortality Prediction in Patients Hospitalized with COVID-19. medRxiv
Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records
Predicting Adverse Drug Reactions on Distributed Health Data using Federated Learning