title: Comparing Automated vs. Manual Data Collection for COVID-Specific Medications from Electronic Health Records
authors: Yin, Andrew L.; Guo, Winston L.; Sholle, Evan T.; Rajan, Mangala; Alshak, Mark N.; Choi, Justin J.; Goyal, Parag; Jabri, Assem; Li, Han A.; Pinheiro, Laura C.; Wehmeyer, Graham T.; Weiner, Mark; Safford, Monika M.; Campion, Thomas R.; Cole, Curtis L.
date: 2021-10-21. journal: Int J Med Inform. DOI: 10.1016/j.ijmedinf.2021.104622

ABSTRACT

INTRODUCTION: Data extraction from electronic health record (EHR) systems occurs through manual abstraction, automated extraction, or a combination of both. While each method has its strengths and weaknesses, both are necessary for retrospective observational research as well as for sudden clinical events like the COVID-19 pandemic. Assessing the strengths, weaknesses, and potential of these methods is important to continue to understand optimal approaches to extracting clinical data. We set out to assess automated and manual techniques for collecting medication use data in patients with COVID-19 to inform future observational studies that extract data from the EHR.

MATERIALS AND METHODS: For 4,123 COVID-positive patients hospitalized and/or seen in the emergency department at an academic medical center between 03/03/2020 and 05/15/2020, we compared medication use data for 25 medications or drug classes collected through manual abstraction and automated extraction from the EHR. Quantitatively, we assessed concordance using Cohen's kappa to measure interrater reliability; qualitatively, we audited observed discrepancies to determine causes of inconsistencies.

RESULTS: Of the 16 inpatient medications, 11 (69%) demonstrated moderate or better agreement; 7 of those demonstrated strong or almost perfect agreement. Of 9 outpatient medications, 3 (33%) demonstrated moderate agreement, but none achieved strong or almost perfect agreement. We audited 12% of all discrepancies (716/5,790) and, in those audited, observed three principal categories of error: human error in manual abstraction (26%), errors in the extract-transform-load (ETL) or mapping of the automated extraction (41%), and abstraction-query mismatch (33%).

CONCLUSION: Our findings suggest many inpatient medications can be collected reliably through automated extraction, especially when abstraction instructions are designed with data architecture in mind. We discuss quality issues, concerns, and improvements for institutions to consider when crafting an approach. During crises, institutions must decide how to allocate limited resources. We show that automated extraction of medications is feasible and make recommendations for improving future iterations.

Data collection from electronic health record (EHR) systems may be conducted through manual abstraction, automated extraction, or a combination of both. Manual abstraction, which involves trained personnel reviewing patient charts and completing case report forms, is often considered the gold standard for retrospective observational research. Many variables require manual adjudication by clinically trained personnel, depending on the complexity of institutional workflows, clinical questions, medical record structure, and other factors [1-10].
Notably, Flatiron Health, with its data set for analytics generated through manual review, sold for nearly $2 billion to the pharmaceutical company Roche, demonstrating the significant value of manually abstracted data [11]. Although considered the gold standard, manual abstraction has limitations: human reviewers are not infallible and can be less accurate in certain cases [12-17]. Importantly, manual abstraction consumes significant time for clinically trained personnel who are needed for patient care and other capacities, especially during times of crisis such as the COVID-19 pandemic.

To address these challenges, studies have demonstrated that automated data extraction from the EHR, which involves direct database queries, can produce data sets of similar quality to manual abstraction for certain variables while saving time for study teams and reducing error [16, 18-24]. Even so, automated extraction is similarly susceptible to data quality issues relating to high complexity or fragmentation of data across many EHR systems [25-28]. Manual abstraction and automated extraction both ultimately depend on the EHR, which is not an objective, canonical source of truth but rather an artifact with its own biases, inaccuracies, and subjectivity [29-35]. While previous work has explored these concepts, optimal approaches for acquiring data from EHR systems for research remain unknown.

Alongside other academic medical centers, our institution, Weill Cornell Medicine, deployed informatics to support COVID-19 pandemic response efforts [36-48]. This included systematic data collection from the EHR, which at our institution occurred through a combination of manual abstraction and automated extraction. Prior investigations have compared manual to automated data collection techniques in conditions other than COVID-19, described informatics resources specific to COVID-19 [47-49], and evaluated the performance of automated extraction in COVID-19 [47-49]. Such evaluations of automated extraction have included problem lists as well as natural language processing to extract signs and symptoms [50-53]. To the best of our knowledge, no studies have compared manual to automated acquisition of medication data in COVID-19. Medication use data are critical to studying new diseases, especially concerning their risk factors and outcomes following certain treatments. For example, ACE inhibitors were debated early in the pandemic due to concerns about their role in exacerbating disease [54].

We sought to quantitatively assess concordance between manual abstraction and automated extraction of EHR data for inpatient and outpatient medications using Cohen's kappa while also qualitatively reviewing instances of discordance to understand sources of error, similar to previous work [15-18, 23]. The COVID-19 pandemic uniquely fueled parallel database creation through both manual and automated methods given the dire need for information. In turn, this enables us to compare these methods in ways not done previously, given the number of patients included, the number of medications included, and the parallel creation of the databases.
Through our comparison and suggestions, we hope to support institutions in improving data collection methods and in allocating resources in future efforts, whether related to COVID-19 or other clinical scenarios [49, 55].

This retrospective observational study occurred at Weill Cornell Medicine (WCM), the biomedical research and education unit of Cornell University, which has 1,600 attending physicians with faculty appointments in the WCM Physician Organization and admitting privileges to NewYork-Presbyterian. Affiliated facilities included NewYork-Presbyterian/Weill Cornell Medical Center (NYP/WCMC), an 862-bed teaching hospital; NewYork-Presbyterian Hospital/Lower Manhattan Hospital (NYP/LMH), a 180-bed community hospital; and NewYork-Presbyterian/Queens (NYP/Q), a 535-bed community teaching hospital. In inpatient and emergency settings, clinicians used the Allscripts Sunrise Clinical Manager EHR system. In outpatient settings, NYP/WCMC and NYP/LMH clinicians used the Epic EHR system while NYP/Q clinicians used the Athenahealth EHR system. The study period was 03/03/2020 (date of the first COVID-positive admission to a WCM campus) to 05/15/2020. The WCM Institutional Review Board approved this study (#20-03021681).

To support institutional pandemic response efforts, we created the COVID Institutional Data Repository (IDR), comprising data retrieved through manual abstraction and automated extraction from EHR systems. The COVID IDR used existing institutional infrastructure for secondary use of EHR data, including Microsoft SQL Server-based pipelines for data acquisition from ambulatory and inpatient EHR systems and research systems, described in prior work [56]. As illustrated in Figure 1, the IDR includes many clinical domains and was designed to support diverse use cases, including clinical operations, quality improvement, and research, with processes to determine access and oversee regulatory approval [56]. Data collected through manual abstraction and automated extraction in the IDR are ultimately derived from the EHR, which, as demonstrated by Hripcsak et al., constitutes an imperfect proxy for the true underlying patient state [30].

Inpatient and outpatient medication data examined in this study were derived from several sources. Inpatient medication data were derived from the medication administration record in Allscripts SCM, the EHR system in use at the time of the pandemic. Outpatient medication data were derived from a combination of free-text mentions in clinical notes, "historical medication" orders entered into the EHR as the result of medication reconciliation at the time of admission, and prescriptions entered into the EHR either at discharge or at an ambulatory care visit.

A team of clinicians (PG, JC, HL, GW, MA, MS) identified data elements in the EHR and created a REDCap case report form [57]. The team provided training on abstraction methods to furloughed medical students and other clinicians (the abstractors) [58]. A daily query identified patients based on the inclusion criteria (admitted to or seen in the emergency department (ED) at NYP/WCMC, NYP/LMH, or NYP/Q AND positive RT-PCR for SARS-CoV-2). Abstractors followed these patients (n=4,123) through their entire hospitalization until discharge, including any subsequent encounters for the same patients who presented to the ED again or were readmitted (n=4,414 visits).
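As a concrete illustration of the daily inclusion-criteria query described above, a minimal sketch follows. The table and column names (encounter, lab_result, and so on) are hypothetical placeholders, not the production WCM schema; Python with an embedded SQL string stands in for the actual pipeline.

```python
# Hypothetical sketch of a daily cohort-identification query; table and column
# names are illustrative assumptions, not the actual WCM data architecture.
import sqlite3

INCLUSION_SQL = """
SELECT DISTINCT e.patient_id, e.encounter_id
FROM encounter e
JOIN lab_result l ON l.patient_id = e.patient_id
WHERE e.facility IN ('NYP/WCMC', 'NYP/LMH', 'NYP/Q')  -- study campuses
  AND e.encounter_type IN ('ED', 'INPATIENT')         -- seen in the ED or admitted
  AND l.test_name = 'SARS-CoV-2 RT-PCR'               -- positive RT-PCR required
  AND l.result = 'POSITIVE'
  AND e.admit_date BETWEEN '2020-03-03' AND '2020-05-15'
"""

def daily_cohort(conn: sqlite3.Connection) -> list:
    """Return (patient_id, encounter_id) pairs meeting the inclusion criteria."""
    return conn.execute(INCLUSION_SQL).fetchall()
```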
The case report form included 14 sections: patient information, comorbidities, symptoms, home (outpatient) medications, ED course, mechanical ventilation, ICU stay, discharge, imaging, disposition, complications, testing, inpatient medications, and survey status. Reviewers relied principally on the inpatient EHR. For portions of the case report form regarding medications, abstractors answered a mix of binary ("yes" or "no") and check-box ("check all that apply") questions. Medications were abstracted at the drug class level (e.g., statins) or the individual level (e.g., hydroxychloroquine). To determine inpatient medication exposure, abstractors used structured order entry and medication administration record data to determine whether a patient received a given medication. To determine outpatient medication exposure (drugs prescribed prior to the hospitalization), manual abstractors relied on outpatient medication orders from the ambulatory EHR system, mentions of drug exposure in clinical notes, and "historical medication" orders entered into the inpatient order-entry system as a result of medication reconciliation after admission. The case report form listed examples of medications from drug classes for abstractors to reference. As a quality check prior to initial publication of registry data, a second abstractor reviewed 10% of records, yielding mean Cohen's kappas of 0.92 and 0.94 for categorical and continuous variables, respectively. Tables 2 and 3 in Appendix A show results from this secondary extraction in the "Validation" columns. These methods have been described in full previously [36]. For the manual abstraction data dictionary, see Appendix B.

Automated extraction captured data for all patients tested for SARS-CoV-2 and/or diagnosed with COVID-19 as documented by EHR systems in the study period. Data were transferred from their underlying raw format (vendor-specific proprietary EHR data models) and loaded into a Microsoft SQL Server database designed with a custom schema. Data were stored in tables corresponding to clinical domains (Appendix C). Instead of using an existing common data model (CDM) such as the Observational Medical Outcomes Partnership (OMOP) or PCORnet CDM, we used a simplified format based on OMOP to include data elements not always assigned reference terminology in EHR source data and to present data in keeping with clinical preconceptions (e.g., separating in-hospital medication administration from outpatient prescriptions to preserve their distinct provenance and usage rather than combining both into a single table) [59, 60].

First, we characterized both constituent elements of the IDR: its manually abstracted and automatically extracted components. For both data sets, we determined how many patients were included, tallied total observations, and determined basic demographic characteristics. Second, for all patient visits (as some patients had multiple hospitalizations) with data collected through both manual abstraction and automated extraction, we quantitatively assessed agreement between the methods using Cohen's kappa. Third, we audited a subset of discrepancies between the results of the automated and manual processes for each medication or drug class to determine the underlying error. Of note, in this comparison of manual abstraction and automated extraction, we did not assume either to be the gold standard, instead seeking the objective strengths and weaknesses of each approach and the concordance between their data.
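To make the provenance-preserving layout described above concrete, here is a minimal sketch of the kind of simplified, OMOP-inspired schema the text describes, with in-hospital administrations and outpatient prescriptions in separate tables. All table and column names are assumptions for illustration, not the actual COVID IDR schema, and SQLite stands in for the SQL Server database.

```python
# Illustrative schema: inpatient administrations and outpatient prescriptions
# kept in separate tables to preserve their distinct provenance and usage.
import sqlite3

DDL = """
CREATE TABLE med_administration (      -- in-hospital medication administration record
    patient_id     TEXT NOT NULL,
    encounter_id   TEXT NOT NULL,
    drug_name      TEXT NOT NULL,
    rxnorm_code    TEXT,               -- may be NULL (e.g., investigational drugs)
    admin_datetime TEXT NOT NULL
);
CREATE TABLE med_prescription (        -- outpatient and discharge prescriptions
    patient_id     TEXT NOT NULL,
    drug_name      TEXT NOT NULL,
    rxnorm_code    TEXT,
    start_date     TEXT NOT NULL,
    end_date       TEXT                -- often missing in ambulatory source data
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
```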
Past studies have used this approach to compare strengths and weaknesses of manual and automated data collection [15-18, 23]. Individual medications and drug classes were studied to better understand both common and specific causes of discrepancies within the two methods and to allow more targeted recommendations for improvement.

Because the data set formats differed, both required transformation before comparison. For example, medication data from manual abstraction were stored as dichotomous variables on a per-patient basis, while automated extraction stored them on a per-order basis. To compare them, we developed Structured Query Language (SQL) queries with outputs displaying the presence or absence of agreement between the two methods (an example of query output is displayed in Appendix A, Table 1). SQL code for the queries is available on GitHub (https://github.com/wcmcresearch-informatics/covid_comparison). Queries were designed to align with the instructions given to manual abstractors. For example, they did not use RxNorm-derived definitions of drug classes, such as the National Drug File-Reference Terminology (NDF-RT) or Anatomical Therapeutic Chemical (ATC) hierarchies, instead using generic names from the manual abstractors' instructions and the clinical discretion of members of the research team (AY, WG, PG, JC) who participated in the manual abstraction. Queries identified administered inpatient medications from the automated extraction database if the date of administration fell within the dates of a given hospitalization. Queries identified outpatient medications based on whether the medication was actively prescribed for a patient at the time of admission to the hospital. It is important to note that a closed system to guarantee medication administration does not exist in the outpatient setting as it does in the inpatient setting.

For each medication or drug class in the query (both inpatient and outpatient), we calculated Cohen's kappa (κ), a statistic commonly used to measure interrater reliability, to quantify agreement between data obtained through manual abstraction and automated extraction. We used the scale developed by McHugh to determine strength of agreement based on the following thresholds: "almost perfect" (κ > 0.9), "strong" (0.9 > κ > 0.8), "moderate" (0.8 > κ > 0.6), "weak" (0.6 > κ > 0.4), "minimal" (0.4 > κ > 0.2), and "none" (0.2 > κ > 0.0) [61]. 95% CIs were calculated for each medication. To assess whether having previous records in our system improved data quality, we calculated κ values based on whether patients had EHR documentation of a prior outpatient, inpatient, or emergency visit to our healthcare system. To account for differing prevalence of medications (i.e., some had very low prevalence) in the data set, we calculated the prevalence-adjusted bias-adjusted kappa (PABAK) and prevalence index (PI) [62].

For each medication or drug class, we randomly audited 10% of the identified discrepancies or 20 discrepancies, whichever was greater. Two members of the research team (AY and WG) adjudicated discrepancies between the results of the manual abstraction and automated extraction to determine the cause of error and the correct output. To do so, they reviewed information from the inpatient and outpatient EHRs as well as the manual abstraction and automated extraction data. ES adjudicated cases of disagreement between AY and WG.
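The agreement statistics described above can be computed directly from the 2x2 table of per-visit yes/no calls. The sketch below is a minimal illustration rather than the study's actual code; it uses the standard formulas for Cohen's kappa, PABAK, and the prevalence index, the McHugh thresholds quoted above, and the stated audit-sampling rule.

```python
# Agreement statistics from the 2x2 table of per-visit calls:
#   a: both methods yes, b: manual yes / automated no,
#   c: manual no / automated yes, d: both no.
import math

def agreement_stats(a: int, b: int, c: int, d: int) -> dict:
    n = a + b + c + d
    po = (a + d) / n                                     # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # chance agreement
    return {
        "kappa": (po - pe) / (1 - pe) if pe < 1 else float("nan"),
        "PABAK": 2 * po - 1,   # prevalence-adjusted bias-adjusted kappa
        "PI": (a - d) / n,     # prevalence index
    }

def mchugh_band(kappa: float) -> str:
    """Strength-of-agreement label using the McHugh thresholds cited above."""
    for cutoff, label in [(0.9, "almost perfect"), (0.8, "strong"),
                          (0.6, "moderate"), (0.4, "weak"), (0.2, "minimal")]:
        if kappa > cutoff:
            return label
    return "none"

def audit_sample_size(n_discrepancies: int) -> int:
    """Audit rule above: 10% of discrepancies or 20, whichever is greater."""
    return min(n_discrepancies, max(20, math.ceil(0.10 * n_discrepancies)))
```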
We classified discrepancies into three principal categories based on whether each was attributable to error in the manual abstraction, error in the automated extraction, or error that could not be attributed to either method specifically. Respectively, these categories are: 1) human error in manual abstraction; 2) automated extraction error in the extract-transform-load (ETL) or mapping process; and 3) unattributable error due to abstraction-query mismatch between the instructions supplied to manual abstractors and the extraction query that flattened automated data for comparison. Although the two processes were designed to align closely, the manual abstraction process was designed first, with instructions that did not necessarily consider the logic of automated extraction. We calculated descriptive statistics of the distribution of error types across these three categories.

During the study period, manual abstraction yielded data for 4,123 patients while automated extraction collected data for 24,944 patients, including the 4,123 patients from manual abstraction. Characteristics of both data sets are shown in Table 1. All 25 medications (16 inpatient and 9 outpatient) in the manual abstraction process were included. The 4,123 patients identified in the manual abstraction process had a total of 4,414 visits, as some patients had multiple hospitalizations during the study period. Based on the manual abstraction, the percent of visits receiving a given medication or drug class ranged from 0% to 60% (e.g., protease inhibitors in 0 of 4,414 visits; hydroxychloroquine in 2,656 of 4,414 visits). Prevalence of each medication or drug class in the manual abstraction and automated extraction can be seen in Table 2. For all counts, percentages, and a more detailed breakdown of the data, please see Appendix A, Tables 2 and 3.

For inpatient medications, we compared data from 16 different medications or drug classes and report the resulting Cohen's kappa (κ) values for each category in Figure 2. For the data used to calculate the κ values, see Appendix A, Table 2. Based on McHugh's benchmark, agreement between manual abstraction and automated extraction was almost perfect for 3 (19%) inpatient medications, strong for 4 (25%), moderate for 4 (25%), weak for 1 (6%), and minimal or none for 4 (25%). The median κ for inpatient medications was 0.75. (Figure 2: data for each medication category are shown as "All" (all 4,123 patients, with 95% CIs overlaid) and as two separate groups: PE, prior exposure to our health system; NPE, no prior exposure.)

For outpatient medications, we compared data from 9 different medications or drug classes and report the resulting Cohen's kappa (κ) values in Figure 3. For the data used to calculate the κ values, see Appendix A, Table 3. Based on McHugh's benchmark, agreement between manual abstraction and automated extraction was moderate for 3 (33%) outpatient medications, weak for 3 (33%), minimal for 2 (22%), and none for 1 (11%). The median κ for outpatient medications was 0.56.

We audited 716 discrepancies, representing 12.37% of the 5,790 total discrepancies detected. For inpatient medications, we audited 13.2% (346/2,621) of all discrepancies and found that in 31% (107/346) the automated extraction was correct and there was human error in the manual abstraction, in 27% (94/346) ETL or mapping error led to error in the automated extraction, and in 42% (145/346) abstraction-query mismatch occurred where the logic in the automated extraction did not match that of the manual abstraction.
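As a small illustration of the descriptive tally over adjudicated audit labels, the following sketch reproduces the inpatient audit percentages just reported from the audited counts. The function and its inputs are illustrative, not the study's code.

```python
# Tally adjudicated discrepancy labels into the three error categories above.
from collections import Counter

CATEGORIES = ("human error", "ETL/mapping error", "abstraction-query mismatch")

def category_distribution(adjudications: list) -> dict:
    counts = Counter(adjudications)
    total = sum(counts[c] for c in CATEGORIES)
    return {c: round(100 * counts[c] / total) for c in CATEGORIES}

# Audited inpatient counts reported above: 107, 94, and 145 of 346.
labels = (["human error"] * 107 + ["ETL/mapping error"] * 94
          + ["abstraction-query mismatch"] * 145)
print(category_distribution(labels))
# {'human error': 31, 'ETL/mapping error': 27, 'abstraction-query mismatch': 42}
```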
Figure 4 shows the breakdown of errors by individual inpatient medications and drug classes. Representative causes and examples of each error type are shown in Table 3.

Table 3. Causes and examples of audited discrepancies.

Human error in manual abstraction
• Complex drug classes or questions (10% of discrepancies): complex drug classes or questions led manual reviewers to classify patients as having been exposed to a medication when they were not. Example: a patient was classified by a manual abstractor as exposed to NSAIDs despite receiving only acetaminophen (a non-NSAID drug) during hospitalization.

ETL/mapping error
• Missing data leading to query error (31%): data missing in the EHR led the query to incorrectly categorize patients as having continued exposure to a given drug. Example: outpatient medications were only included by manual abstractors if the patient was exposed based on admission documentation, but many orders in the outpatient EHR lack end dates, requiring further work for appropriate automated calculation.
• Local errors (5%): missing reference terminology in source systems caused failure to detect some medications during automated extraction. Example: remdesivir and sarilumab were not coded to the RxNorm vocabulary due to their investigational status, and exposures to these drugs were not detected in the automatically extracted data.
• Patient identifier inconsistency (4%): patient identifiers were missing or incorrect, leading to discrepancies in specific drug exposures. Example: two patients shared the same enterprise master patient index, resulting in conflation of their data.
• Cross-institutional differences (1%): data were not mapped correctly between hospital campuses, leading to incorrectly classified drug classes in the automatically extracted data. Example: the formulary from one hospital was mapped to the formulary from another, yielding incorrect classes for some drugs.

Abstraction-query mismatch
• Mismatch between query and data format (30%): for inpatient medications where duration of administration was important, the query overlooked medications ordered daily rather than continuously, because order duration was used to measure duration of administration. Example: diuretics were commonly ordered as single doses each day; although a patient could receive diuretics on consecutive days, the query detected each dose as having only a 24-hour duration, while the manual abstraction instructions required a minimum 48-hour duration.
• Complex instructions/confounding medications (3%): lack of clarity in some special instructions for specific medication categories created challenges in developing the query. Example: manual abstractors were instructed to capture protease inhibitor exposure only if the drug was part of an HIV regimen; the automated extraction query did not take this into account, and confounding medication names also led to inappropriate inclusion.

In a comparison of data collected through manual abstraction and automated extraction for COVID-19 patients at the height of the pandemic, we observed that automated extraction performed as well as manual abstraction for many inpatient medications but poorly for most outpatient medications. This suggests that future efforts to collect inpatient medication data need not rely on manual abstraction, allowing institutions to direct valuable human resources toward other needs. For inpatient medications, 44% (7/16) of medications or drug classes reached strong agreement or higher, many of them among the more prevalent medications (e.g., statins, hydrochlorothiazide).
The 25% (4/16) of inpatient medications with minimal to no agreement (Cohen's kappa < 0.4) were affected either by formulary-related errors in the ETL/mapping of the automated extraction (remdesivir, sarilumab) or by infrequent clinical usage (protease inhibitors, lopinavir/ritonavir) that made kappa incalculable. Query-related challenges in matching time-specific manual abstraction instructions affected diuretics, NSAIDs, statins, antibiotics, and ACEi/ARBs.

Of the 9 outpatient medications and drug classes, 33% (3/9) reached moderate agreement, while none achieved strong or almost perfect agreement. Based on our audit, these outcomes were driven by poor data quality in the EHR. Outpatient medications are not recorded with the same rigor as inpatient medications given the inability to truly confirm whether a patient is taking a prescription. Additionally, outpatient medications often lack end dates in the EHR, as medications commonly go unreconciled or have undefined order lengths; thus the query categorized these as "active" (the alternative being broad under-detection of home medications), even if the medication was neither included in admission documentation nor recorded by manual abstraction.

Interrater reliability was not consistently increased or decreased in patients with previous visits to our health system for inpatient or outpatient medications (i.e., patients with a previous visit did not consistently have a higher kappa value). This variation likely stems from a double-edged benefit: more data enriches the EHR, but entries of previous medications with poor or missing documentation lead to erroneous detection.

In considering data sources for retrospective observational research, manual abstraction is often considered the gold standard. Although manual abstractors can navigate interface errors, read free text, and generally benefit from clinical judgement to interpret data, they are a limited resource and susceptible to human error [12-14]. Automated extraction, while theoretically capable of flawlessly mirroring data in the EHR, is subject to several prominent issues impacting the utility and fitness of the data for secondary use [25, 26, 29-32]. Because we observed errors in both methods, our findings suggest there is not a one-size-fits-all solution to generating research data sets using the EHR and that the conceptualization of "data quality" should be expanded in these contexts to better consider the provenance of the data in question and the nature of the downstream use case. Inpatient medications were thoroughly documented in the EHR, and automated extraction techniques performed well, suggesting manual efforts could target other areas such as outpatient medications or other domains requiring interpretation of context or setting, such as provider notes. We believe that many observed issues could be improved in future work by designing studies to account for automated extraction logic. In Table 4, we suggest ways to address the common errors presented in Table 3.

Table 4. Suggested approaches to common error types.
• Patient identifier inconsistency: work with existing health information management teams within the clinical informatics domain to address observed issues in identity management.
• Cross-institutional differences: ensure that differences between EHR systems at subsites of large hospital systems are properly addressed before incorporating their data; implement "sanity checks" on mappings to identify errors before pushing new data.
• Mismatch between query and data format: design abstraction instructions and extraction queries together so that study logic accounts for the underlying data architecture (see the sketch after this table).
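For the "mismatch between query and data format" row, one concrete remedy, sketched below under assumed data structures rather than as the study's implementation, is to merge same-drug administrations on consecutive calendar days into exposure runs before applying a minimum-duration rule such as the 48-hour threshold used for diuretics in the manual instructions.

```python
# Coalesce daily single-dose orders into consecutive-day exposure runs before
# applying a minimum-duration rule; the data layout is hypothetical.
from datetime import date, timedelta

def exposed_at_least(admin_dates: list, min_days: int = 2) -> bool:
    """True if administrations span >= min_days consecutive calendar days."""
    days = sorted(set(admin_dates))
    if not days:
        return False
    run = 1
    for prev, cur in zip(days, days[1:]):
        run = run + 1 if (cur - prev) == timedelta(days=1) else 1
        if run >= min_days:
            return True
    return run >= min_days  # handles min_days == 1 with a single dose day

# A patient given once-daily diuretic doses on consecutive days now counts as
# a >= 48-hour exposure, matching the manual abstraction instructions.
print(exposed_at_least([date(2020, 4, 1), date(2020, 4, 2)]))  # True
```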
While previous work on COVID-19 data has evaluated the quality of extraction for problem lists and natural language processing for signs and symptoms [50-53], to our knowledge, ours is the first work evaluating manual abstraction and automated extraction of EHR medication data relevant to COVID-19. Previous studies comparing manual abstraction and automated extraction have usually reviewed fewer patients or focused on theoretical principles [16, 18-25]. To support future studies of this kind, particularly recent efforts to create widely available data sets of COVID-19 cases such as N3C, we hope this work provides a roadmap and highlights new variables eligible for automated extraction with high accuracy (i.e., inpatient medications), allowing valuable clinical resources needed for manual abstraction to be redirected toward other domains [47-49, 63]. Similarly, automated extraction methods as demonstrated here can be the foundation for more closely adhering to certain best-practice methods for data quality standards and assessment [64, 65]. The current study demonstrates the viability of automated extraction of many inpatient medications.

This study has certain limitations. First, although the findings and implications of this work are more broadly applicable, both abstraction methods were tailored to our hospital system; other hospital systems should create queries specific to their own data architecture. Second, although the audit process was extensive, it focused on discrepancies rather than a complete random audit of all results. Certain errors may have existed among agreeing results (e.g., manual and automated approaches may have both detected a medication that a patient did not receive, or vice versa). The Cohen's kappa calculation is unaffected by this, and there may be other errors that were not characterized. Third, systematic data quality issues may have affected certain patient populations, for example through reduced clinical resources in certain hospitals during the peak of the crisis. Although extracting across 3 different hospitals is an overall strength of the study, it also introduces variation in staff, procedures, and patient populations.

Future work hopes to deliver similar analyses for other data points such as comorbidities and outcomes. Future work will also explore concepts outside the scope of the current work, including further pursuit of the data quality improvement topics in Table 4 as well as better characterization of the timeline and context in which these errors in the data occur.

COVID-19 has changed the landscape of healthcare, creating opportunities to improve data infrastructures. The current study assessed agreement on outpatient and inpatient medication exposure between data collected through manual abstraction and automated extraction, exploring underlying causes of discrepancies and offering ways to avoid them. It demonstrates that automated collection of medication data is feasible and, for many inpatient medications, could save the time and resources required for manual abstraction. As this pandemic has shown, institutions must make tough decisions about where to allocate resources. This work outlines quality issues for institutions to be aware of and improvements that could be made.
• Data extraction from electronic health record (EHR) systems occurs through manual abstraction, automated extraction, or a combination of both, with each process having its strengths and weaknesses depending on setting and data type.
• Prior investigations have compared manual to automated data collection techniques in conditions other than COVID-19, described informatics resources specific to COVID-19, and evaluated the performance of automated extraction in COVID-19, including problem lists and natural language processing to extract signs and symptoms.
• Automated extraction performed as well as manual abstraction for many inpatient medications and poorly for most outpatient medications, suggesting future efforts to collect inpatient medication data need not rely on manual abstraction and allowing institutions to direct valuable human resources toward other needs.
• Both automated extraction and manual abstraction have strengths and weaknesses that must be considered in any data extraction effort. Institutions can now be more aware of the potential trade-offs and areas of improvement.

References
The retrospective chart review: important methodological considerations
Methodology to improve data quality from chart review in the managed care setting
Medical Record Review Conduction Model for Improving Interrater Reliability of Abstracting Medical-Related Information
Data Quality in Clinical Research
Ensuring high accuracy of data abstracted from patient charts: the use of a standardized medical record as a training tool
Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review
Evaluating the state of the art in coreference resolution for electronic medical records
i2b2/VA challenge on concepts, assertions, and relations in clinical text
Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review
The Deeply Human Core of Roche's $2.1 Billion Tech Acquisition --And Why It Made It
Factors Affecting Accuracy of Data Abstracted from Medical Records
Transcription Error Rates in Retrospective Chart Reviews
A comparative effectiveness study of eSource used for data capture for a clinical research registry
Effects of blood glucose transcription mismatches on a computer-based intensive insulin therapy protocol
Secondary EMR data for quality improvement and research: A comparison of manual and electronic data collection from an integrated critical care electronic medical record system
Assessing Data Quality in Manual Entry of Ventilator Settings
Electronic Versus Manual Data Processing: Evaluating the Use of Electronic Health Records in Out-of-hospital Clinical Research
Using automated electronic medical record data extraction to model ALS survival and progression
Comparison of manual versus automated data collection method for an evidence-based nursing practice study
Detection of Pharmacovigilance-Related Adverse Events Using Electronic Health Records and Automated Methods
Comparison of computerized surveillance and manual chart review for adverse events
Quality of EHR data extractions for studies of preterm birth in a tertiary care center: guidelines for obtaining reliable data
ReCAP: Feasibility and Accuracy of Extracting Cancer Stage Information From Narrative Electronic Health Record Data
Caveats for the use of operational electronic health record data in comparative effectiveness research
Recommendations for the use of operational electronic health record data in comparative effectiveness research
A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data
Impact of data fragmentation across healthcare centers on the accuracy of a high-throughput clinical phenotyping algorithm for specifying subjects with type 2 diabetes mellitus
Secondary Use of EHR: Data Quality Issues and Informatics Opportunities
Next-generation phenotyping of electronic health records
Sick patients have more data: the non-random completeness of electronic health records
Defining and measuring completeness of electronic health records for secondary use
Towards augmenting structured EHR data: a comparison of manual chart review and patient self-report
Accuracy of Electronically Reported "Meaningful Use
Clinical Characteristics of Covid-19 in
Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the
Clinical informatics during the COVID-19 pandemic: Lessons learned and implications for emergency department and inpatient operations
Creating and implementing a COVID-19 recruitment Data Mart
ELII: A Novel Inverted Index for Fast Temporal Query, with Application to a Large Covid-19 EHR Dataset
Clinical Features of 85 Fatal Cases of COVID-19 from Wuhan: A Retrospective Observational Study
Acute cerebrovascular disease following COVID-19: a single center, retrospective, observational study
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study
Hospitalization and Mortality among Black Patients and White Patients with Covid-19
Association of Treatment With Hydroxychloroquine or Azithromycin With In-Hospital Mortality in Patients With COVID-19
Rapid response to COVID-19: health informatics support for outbreak management in an academic health system
Pandemic as a Catalyst for Rapid Implementation
Use of electronic health records to support a public health response to the COVID-19 pandemic in the United States: a perspective from 15 academic medical centers
Data gaps in electronic health record (EHR) systems: An audit of problem list completeness during the COVID-19 pandemic
COVID-19 SignSym: a fast adaptation of a general clinical NLP tool to identify and normalize COVID-19 signs and symptoms to OMOP common data model
Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework
ConceptWAS: A high-throughput method for early identification of COVID-19 presenting symptoms and characteristics from clinical notes
Renin-Angiotensin-Aldosterone System Inhibitors and Risk of Covid-19
Biomedical and health informatics approaches remain essential for addressing the COVID-19 pandemic
Secondary Use of Patients' Electronic Records (SUPER): An Approach for Meeting Specific Data Needs of Clinical and Translational Researchers. AMIA Annu. Symp. Proc., 2017
Research electronic data capture (REDCap)-A metadata-driven methodology and workflow process for providing translational research informatics support
Medical Students as Essential Frontline Researchers During the COVID-19 Pandemic
Launching PCORnet, a national patient-centered clinical research network
Advancing the Science for Active Surveillance: Rationale and Design for the Observational Medical Outcomes Partnership
Interrater reliability: the kappa statistic
The disagreeable behaviour of the kappa statistic
The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment
Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research
A Data Quality Assessment Guideline for Electronic Health Record Data Reuse

Acknowledgements
The authors would like to acknowledge the large institutional efforts that enabled this work and the individuals who advised, shaped, and supported this manuscript. The authors have no competing interests to declare.

Highlights
• Many inpatient medications can be collected reliably through automated extraction
• Data quality issues in outpatient medications impede automated extraction
• Performance differences are driven by data abstraction architecture and data sources
• Automated extraction has the potential to save valuable resources in clinical crises