key: cord-0724048-3466el8a authors: Helseth, Donald L.; Gulukota, Kamalakar; Miller, Nicholas; Yang, Mathew; Werth, Tom; Sabatini, Linda M.; Bouma, Mike; Dunnenberger, Henry M.; Wake, Dyson T.; Hulick, Peter J.; Kaul, Karen L.; Khandekar, Janaradan D. title: Flype: Software for enabling personalized medicine date: 2020-12-03 journal: Am J Med Genet C Semin Med Genet DOI: 10.1002/ajmg.c.31867 sha: b547ba5cf3977a2bfcd604be2991944aff21e53a doc_id: 724048 cord_uid: 3466el8a The advent of next generation DNA sequencing (NGS) has revolutionized clinical medicine by enabling wide‐spread testing for genomic anomalies and polymorphisms. With that explosion in testing, however, come several informatics challenges including managing large amounts of data, interpreting the results and providing clinical decision support. We present Flype, a web‐based bioinformatics platform built by a small group of bioinformaticians working in a community hospital setting, to address these challenges by allowing us to: (a) securely accept data from a variety of sources, (b) send orders to a variety of destinations, (c) perform secondary analysis and annotation of NGS data, (d) provide a central repository for all genomic variants, (e) assist with tertiary analysis and clinical interpretation, (f) send signed out data to our EHR as both PDF and discrete data elements, (g) allow population frequency analysis and (h) update variant annotation when literature knowledge evolves. We discuss the multiple use cases Flype supports such as (a) in‐house NGS tests, (b) in‐house pharmacogenomics (PGX) tests, (c) dramatic scale‐up of genomic testing using an external lab, (d) consumer genomics using two external partners, and (e) a variety of reporting tools. The source code for Flype is available upon request to the authors. Both NGS and this computational technology could be deployed in a hospital through a combination of in-house development and outsourcing to a vendor. 
Outsourcing NGS usually takes the form of contracting with an external lab to sequence hospital specimens. Given the rapid pace of technology development in the NGS space, the external labs contracted to do this work change constantly, which makes it a challenge to integrate them fully with clinical care for reasons explained below. Primarily because of this difficulty of integration, external lab reports are frequently faxed back to the hospital and then scanned into the electronic health record (EHR) as static images rather than as usable discrete data elements. Even when reporting is not fax-based, it remains far from full integration with the EHR and from providing appropriate clinical decision support. When in-house NGS development is added to the mix, as is often the case, a data management challenge ensues because of the increased variety of results that must now be incorporated back into the EHR. To solve this integration challenge the informatics platform must, at a minimum, satisfy the eight requirements summarized in Table 1.
Implementing any rapidly evolving technology like genomics into clinical practice has long been a challenge due to financial and organizational concerns. Organizations must also solve financial challenges related to reimbursement for these novel tests. Reimbursement will become common once these tests become routine; until then, it is useful to have patient advocates and other dedicated personnel work with payers to obtain reimbursement. The costs for bioinformatics and additional IT infrastructure are fixed and cannot be assigned to any single test; they are usually amortized over a large number of tests in multiple NGS assays. Healthcare organizations usually prefer long-term, stable relationships with vendors, but implementing genomics necessitates flexibility, which in practice means pursuing short-term contracts.
Implementing new genomic testing also requires a robust education environment for practitioners and nurses to ensure they use and interpret the tests appropriately. Finally, patient data safety and integrity are of primary concern for IT in every hospital, and those concerns must be addressed even as large amounts of genomic data are analyzed and processed. Given the volume of data, there is an active debate in the community (Zandesh, Ghazisaeedi, Devarakonda, & Haghighi, 2019) regarding the pros and cons of storing and analyzing these data in the Cloud as opposed to on-premises. These issues must be dealt with at an institutional level, and what is appropriate for each organization depends on the current status of its technology.
When our personalized medicine program began, the priority needs were variant annotation and a variant repository (items 3 and 4, Table 1). However, as we attempted to fulfill these needs with commercial software, it quickly became apparent that the other needs listed in Table 1 are equally important for personalized medicine to truly advance at our institution. For example, the commercial product we deployed adequately identified variants, but it was unable to transmit that information to the EHR. It also failed to retain interpretation information on variants from one sample to the next (when the same variant might be seen again). Further, it had limited ability to connect to ever-changing external knowledge bases like OncoKB (Chakravarty et al., 2017); such connections entailed convincing the vendor to build the specific software modules. Finally, when we introduced an additional assay, namely pharmacogenomics (PGX) testing, basic interpretation of the data required combining genotypes at multiple loci in a gene into a single "star allele" diplotype for that gene. Even this basic requirement was entirely outside the purview of our installed commercial software, and we had to look for other solutions, including building one ourselves.
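The step of combining genotypes at multiple loci into a single star-allele diplotype can be illustrated with a minimal sketch. This is not the algorithm used in Flype; the allele definitions and locus names below are hypothetical placeholders, and the sketch assumes unphased, biallelic genotypes with no copy number variation.

```python
from itertools import combinations_with_replacement

# Hypothetical allele definitions for a made-up gene: each star allele is
# described by the set of loci at which it carries the variant base.
# "*1" stands for the reference allele and carries no variants.
STAR_ALLELES = {
    "*1": frozenset(),
    "*2": frozenset({"locusA"}),
    "*3": frozenset({"locusB"}),
    "*4": frozenset({"locusA", "locusB"}),
}
LOCI = ("locusA", "locusB")


def call_diplotypes(genotype):
    """Return every star-allele pair consistent with unphased genotype counts.

    `genotype` maps each locus to the number of variant copies observed
    (0, 1 or 2); loci missing from the mapping are treated as 0.
    """
    matches = []
    for a, b in combinations_with_replacement(sorted(STAR_ALLELES), 2):
        # Variant copies expected at each locus if the patient carried a and b.
        expected = {
            locus: (locus in STAR_ALLELES[a]) + (locus in STAR_ALLELES[b])
            for locus in LOCI
        }
        if all(genotype.get(locus, 0) == expected[locus] for locus in LOCI):
            matches.append(f"{a}/{b}")
    return matches
```

With these placeholder definitions, a genotype that is heterozygous at both loci is consistent with two diplotypes (`*1/*4` and `*2/*3`), which shows why unphased data can leave a diplotype call ambiguous and why this interpretation step needs dedicated logic.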
2 | MATERIAL AND METHODS
In the past 5 years at NorthShore, Flype has supported PGX testing on two different internal platforms, received data from several commercial laboratories, enabled rapid scale-up of partnership programs with vendor laboratories as well as down-scaling of other partnerships, incorporated tens of thousands of outside genetic tests including full EHR integration, and enabled the testing and launching of several in-house lab-developed NGS assays with custom software to support their data. The modules that accomplish all this can be divided into a user-facing web portal and three distinct components on the backend: a relational database for storing data, a custom code base of bioinformatics pipelines, and Concourse, a robust framework for connecting to a large number of external services.
The most visible (user-facing) component of Flype (Figure 1)
We maintain a local copy of VEP to avoid sending patient information to a remote server, to guard against network disruptions, and to better control the version of VEP used in annotation. We also maintain local copies of reference databases from other important sources like ClinVar (Landrum et al., 2020) and gnomAD (Karczewski et al., 2020) population frequencies so that variants in these databases are matched to variants in Flype using our own software. This separation allowed us to keep the core gene and RefSeq annotation stable while updating these other sources more frequently.
OncoKB (Chakravarty et al., 2017), the IARC TP53 database (Bouaoun et al., 2016) and BRCAExchange (Cline et al., 2018).
Connections with other potential partners are constantly being considered in this rapidly changing genomics world, and Flype provides our organization with the unique ability to quickly launch tests and abandon those that prove unsatisfactory.
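The idea of a framework in which external connections can be added or dropped quickly can be sketched as a small registry. This is an illustrative assumption about how such a design might look, not a description of Concourse itself; the registry, the decorator, and both formats (including the made-up `pipe_delimited` format) are hypothetical.

```python
import json

# Hypothetical registry: maps a connection name to a function that turns a
# payload dictionary into the text sent over that connection.
CONNECTORS = {}


def connector(name):
    """Decorator that registers a serializer under the given connection name."""
    def register(func):
        CONNECTORS[name] = func
        return func
    return register


@connector("json")
def to_json(payload):
    # Sorted keys make the output deterministic.
    return json.dumps(payload, sort_keys=True)


@connector("pipe_delimited")
def to_pipe_delimited(payload):
    # A made-up flat format, used only to show a second pluggable connection.
    return "|".join(f"{key}={payload[key]}" for key in sorted(payload))


def send(name, payload):
    """Serialize the payload with the named connector; fail loudly if it is absent."""
    if name not in CONNECTORS:
        raise KeyError(f"no connector registered for {name!r}")
    return CONNECTORS[name](payload)
```

In this sketch, abandoning a connection amounts to removing one registry entry (`CONNECTORS.pop("pipe_delimited")`), and the rest of the code is unaffected.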
The versatile connectivity of Concourse allowed Flype to help our institution streamline workflows in a surprising way during the recent COVID-19 pandemic (manuscript in preparation).
Flype satisfies a number of uses, some of which are described below. Its open and modular architecture has enabled NorthShore to grow our PMed program by helping launch PGX testing (including switching between PGX platforms), enabling rapid scale-up to incorporate over 10,000 genetic tests from external labs, and helping launch several in-house lab-developed assays for somatic variants. Flype can also be used in a research environment, without connecting to an EHR, to help with sample analysis.
This workflow, illustrated in Figure 2,
It is important to note that the results dashboard connects to an (Chakravarty et al., 2017) or specialty databases. In addition, pathologists can choose to use an aggregated variant description (such as "KRAS activating" or "TP53 inactivating") instead of the individual variant where appropriate. These values are passed to the sign-out page, so that aggregated interpretations in Convo can be brought into the report when appropriate brand-new interpretation, it gets added to Convo and is available for the next sample in which the same variant with the same diagnosis may be encountered. Using all this information, the SP uses a drop-down next to each variant on the dashboard to assign it to be (a) included on the front page of the report, (b) included in the report as a VUS, or (c) not included in the report. The SP can also manually add findings from additional assays, such as the ArcherDX (Boulder, CO) gene fusion assay. Finally, Flype displays a PDF report using the SP's detailed interpretations of all front-page variants and the list of identified VUS, as well as boilerplate information about the assay. Upon sign-out, this PDF report is sent to the EHR.
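The reuse of interpretations for the same variant and diagnosis, together with the three-way report assignment, can be sketched as follows. The class, function, section names, and example variants are hypothetical stand-ins chosen for illustration; they are not taken from Convo or from Flype's code.

```python
REPORT_SECTIONS = ("front_page", "vus", "excluded")


class InterpretationStore:
    """Hypothetical stand-in for an interpretation knowledge base.

    Interpretations are keyed by (variant, diagnosis), so text written for one
    sample is available the next time the same pair is encountered.
    """

    def __init__(self):
        self._entries = {}

    def lookup(self, variant, diagnosis):
        return self._entries.get((variant, diagnosis))

    def save(self, variant, diagnosis, text):
        self._entries[(variant, diagnosis)] = text


def build_report(assignments, diagnosis, store):
    """Group variants by the report section chosen for each one.

    `assignments` maps a variant name to one of REPORT_SECTIONS. Front-page
    variants are paired with their stored interpretation (None if absent).
    """
    report = {"front_page": [], "vus": []}
    for variant, section in assignments.items():
        if section not in REPORT_SECTIONS:
            raise ValueError(f"unknown report section: {section!r}")
        if section == "front_page":
            report["front_page"].append((variant, store.lookup(variant, diagnosis)))
        elif section == "vus":
            report["vus"].append(variant)
        # Variants marked "excluded" are left out of the report entirely.
    return report
```

Keying the store on both variant and diagnosis reflects the statement above that an interpretation is reused when the same variant is seen with the same diagnosis; a lookup under a different diagnosis returns nothing in this sketch.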
Discrete data elements from the report are also available but, at present, we do not transmit them to the EHR for tumor genomic testing. Once a report has been signed out, the PDF can be reviewed but cannot be changed; however, users with SP privileges can "unlock" a signed-out sample to prepare an addendum or amendment.
Pharmacogenomics is the application of PMed that promises to optimize drug therapy through knowledge of a patient's genome (Dunnenberger et al., 2016). The PGX workflow in Flype (Figure 5) follows a similar path to NGS and illustrates all eight of the requirements enumerated in Table 1 concern about the script based on the patient's PGX results; some of those concerns trigger an interruptive warning (Wake, Ilbawi, Dunnenberger, & Hulick, 2019) to complete the clinical decision support loop. Only authorized individuals can submit a patient report to the EHR, but any user can leverage the Kensa KB to create an anonymous report of prescribing recommendations by specifying diplotypes of various pharmacogenes on Flype. The Kensa KB is organized by genes, diplotypes and drugs metabolized by those genes. It is based upon standards developed by CPIC (Caudle et al., 2017) and PharmVar (Gaedigk et al., 2019) and is maintained by permissioned members of the PGX team through Flype's Kensa page.
FIGURE 5 The workflow for in-house PGX testing is highlighted: sample flow starts with a browser upload of genotype data, which is processed by the Kensa algorithm, inserting diplotypes and genotypes into the variant database for each sample; this generates new entries in the Flype sample view, available for the pharmacogenomicist to browse. After review, the pharmacogenomicist signs out the sample, which generates a PDF report as well as a set of discrete data elements, both of which are sent to the EHR. The sign-out button also generates an encrypted VCF which is sent to an external knowledge base partner
Table 1.
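A knowledge base organized by genes, diplotypes and drugs, as described for the Kensa KB above, can be sketched as a nested lookup that produces an anonymous report from user-specified diplotypes. The structure and every entry below (gene, diplotypes, drug, recommendation text) are placeholders invented for illustration; they do not reproduce Kensa KB or CPIC content.

```python
# Hypothetical knowledge base: every gene, diplotype, drug and recommendation
# below is a placeholder, not content from the Kensa KB or from CPIC.
KB = {
    "GENE1": {
        "phenotypes": {
            "*1/*1": "normal metabolizer",
            "*1/*2": "intermediate metabolizer",
            "*2/*2": "poor metabolizer",
        },
        "drugs": {
            "drugA": {
                "normal metabolizer": "placeholder: standard prescribing",
                "poor metabolizer": "placeholder: review before prescribing",
            },
        },
    },
}


def anonymous_report(diplotypes):
    """Build report rows from a gene -> diplotype mapping.

    No patient identifier is involved. Each row is
    (gene, diplotype, phenotype, drug, recommendation).
    """
    rows = []
    for gene in sorted(diplotypes):
        diplotype = diplotypes[gene]
        entry = KB.get(gene, {})
        phenotype = entry.get("phenotypes", {}).get(diplotype)
        if phenotype is None:
            # Unknown gene or diplotype: report it rather than guess.
            rows.append((gene, diplotype, "no phenotype in KB", None, None))
            continue
        for drug in sorted(entry.get("drugs", {})):
            advice = entry["drugs"][drug].get(phenotype)
            if advice is not None:
                rows.append((gene, diplotype, phenotype, drug, advice))
    return rows
```

Separating the diplotype-to-phenotype step from the phenotype-to-drug step is one possible design; it lets curators update drug recommendations without touching allele definitions.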
The workflow starts when an order for DNA10K is submitted by a clinician through the EHR. Using Concourse, Flype receives this order and transmits it to Color. Next, a blood specimen is shipped from the clinic to Color, and when results are signed out at Color, Flype retrieves the results through a web services call. Color provides the results as discrete data as well as PDF reports; both are downloaded by Flype and transmitted to the EHR. Flype also repackages the PGX data into a VCF and sends that file to ActX (Seattle, WA), so that Color testing has the same level of clinical decision support as in-house PGX testing.
Of 10
Color also provides continuous updates to patient results, including reclassification of variants, for example, from VUS to likely pathogenic. Flype retrieves these updates from Color and transmits them to the EHR (requirement 8, Table 1).
When we initiated Flype almost 5 years ago, the field appeared to be in flux and we expected that by about now it would have settled. We expected that the clinical use cases and workflows would be well understood and streamlined. Instead, we find that the use cases today are even more diverse than they were then. Even so, some general trends are strongly apparent. First, the dichotomy between clinically actionable findings and all other genomic findings is getting weaker with time. It is common for patients to want to discuss with their medical geneticist a report from a consumer genomics company, even though most of the data in those reports are not clinically actionable today. However, it is very clear that over time the actionable portion will grow. Therefore, for the foreseeable future, it is important to have a clearing house for genomic data, that is, a non-EHR destination capable of holding genomic findings considered nonactionable at the moment. Second, the $1,000 genome is still far away for clinical purposes.
Even though it will be achieved in purely NGS technology terms, it is very unclear how a whole genome would even be represented, let alone utilized, in clinical care. Interfacing with multiple KBs, providing clinical decision support in the EHR and periodic update of variant status for all patients would be valuable. All three of these capabilities are bundled in with Flype and will be put to the test in the future as more and more genomic data are marked up as clinically relevant. Third, given the economics of genome sequencing technology, it appears likely that commercial NGS labs will remain a strong presence for the foreseeable future. With newer techniques like cell-free DNA analysis (so-called liquid biopsies) making their appearance routinely, the need to interface with multiple labs will remain important. At present, most labs only report clinically actionable findings. While this is understandable for a signed-out report, which can only contain a reasonably small number of variants, we expect that labs will start reporting out (perhaps through an API layer) all variants that are reliably detected, whether or not those variants are clinically actionable at this moment. Making this a routine practice should allow appropriately equipped organizations to track the status of these variants and provide a more robust integration of genomic results into clinical practice.
FIGURE 7 One of Flype's reporting tools is illustrated in this diagram, which shows representative phenotype data for CYP2C19. The reporting tools also generate genotype- and diplotype-level views of PGX results and can illustrate performance of individual SNPs or alternate assays such as copy number, which are useful for evaluating assay performance. Data are also segregated by assay type
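The periodic update of variant status discussed in this section can be sketched as a comparison between stored classifications and a newly retrieved batch. The function name, the data layout, and the identifiers in the usage example are assumptions made for illustration, not Flype's implementation.

```python
def find_reclassifications(stored, incoming):
    """List variants whose classification changed between two snapshots.

    Both arguments map a (patient_id, variant) pair to a classification
    string. Pairs that appear only in `incoming` are new results rather than
    reclassifications, so they are not reported here.
    """
    changes = []
    for key in sorted(incoming):
        old = stored.get(key)
        new = incoming[key]
        if old is not None and old != new:
            patient_id, variant = key
            changes.append(
                {"patient_id": patient_id, "variant": variant, "old": old, "new": new}
            )
    return changes
```

For example, if a stored result lists a variant as "VUS" for one patient and the incoming batch lists the same patient and variant as "likely pathogenic", the function returns a single change record for that pair, which a downstream step could then forward to the EHR.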
Writing modular, extensible code as a part of an ecosystem like Flype will allow organizations to respond to changing data standard formats (JSON, HL7, FHIR) and treat them as just another third-party connection to snap on to the existing framework. The plug-and-play aspect of Flype's Concourse module allows us to contemplate this and add or remove functionality in an agile paradigm.
We present Flype, a web-based bioinformatics platform for use on an
All authors are employees of NorthShore University HealthSystem.
REFERENCES
AACR project GENIE: Powering precision medicine through an international consortium
TP53 variations in human cancers: New lessons from the IARC TP53 database and genomics data
Standardizing terms for clinical pharmacogenetic test results: Consensus terms from the clinical pharmacogenetics implementation consortium (CPIC)
OncoKB: A precision oncology Knowledge Base
Adding genetic risk score to family history identifies twice as many high-risk men for prostate cancer: Results from the prostate cancer prevention trial
A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3
BRCA challenge: BRCA exchange as a global resource for variants in BRCA1 and BRCA2
Django project. (n.d.). Django Project
Implementation of a multidisciplinary pharmacogenomics clinic in a community health system
Genomic medicine--an updated primer
The evolution of PharmVar
Lollipops in the clinic: Information dense mutation plots for precision medicine
The mutational constraint spectrum quantified from variation in 141,456 humans
ClinVar: Improvements to accessing data
The Ensembl variant effect predictor
Normalized names for clinical drugs: RxNorm at 6 years
NorthShore and Color complete delivery of clinical genomics in routine care to 10,000 patients in the largest U.S. program to date, with plans to expand in 2020
Variant review with the integrative genomics viewer
dbSNP: The NCBI database of genetic variation
COSMIC: The catalogue of somatic mutations in cancer
Pharmacogenomics: Prescribing precisely
Pharmacogenomics knowledge for personalized medicine
Legal framework for health cloud: A systematic review
CrossMap: A versatile tool for coordinate conversion between genome assemblies
ACKNOWLEDGMENTS
The development of Flype has benefited from the continuous feedback from Drs. Mir B. Alikhan, Kathy A. Mangold, and Nora E. Joseph and from Mike Akroush, MS. We also acknowledge technical support from Charlie Cron, Eric Cron and the rest of the Unix team in our Health Information Technology department. The authors gratefully acknowledge the support from multiple anonymous Foundations. The source code is available upon request to the authors.
Peter J. Hulick https://orcid.org/0000-0001-8397-4078