key: cord-0781571-hwerg2f4
title: Effect of Peer Benchmarking on Specialist Electronic Consult Performance in a Los Angeles Safety-Net: a Cluster Randomized Trial
authors: Meeker, Daniella; Friedberg, Mark W.; Knight, Tara K.; Doctor, Jason N.; Zein, Dina; Cayasso-McIntosh, Nancy; Goldstein, Noah J.; Fox, Craig R.; Linder, Jeffrey A.; Persell, Stephen D.; Dea, Stanley; Giboney, Paul; Yee, Hal F.
date: 2021-09-09
journal: J Gen Intern Med
DOI: 10.1007/s11606-021-07002-1
sha: 67f82cef399c31f37822422833ac490417ca78e7
doc_id: 781571
cord_uid: hwerg2f4

BACKGROUND: Since the advent of COVID-19, accelerated adoption of systems that reduce face-to-face encounters has outpaced training and best practices. Electronic consultations (eConsults), structured communications between PCPs and specialists regarding a case, have been effective in reducing face-to-face specialist encounters. As the health system rapidly adapts to multiple new practices and communication tools, new mechanisms to measure and improve performance in this context are needed.

OBJECTIVE: To test whether feedback comparing physicians to top-performing peers using co-specialists' ratings improves performance.

DESIGN: Cluster-randomized controlled trial.

PARTICIPANTS: Eighty facility-specialty clusters and 214 clinicians.

INTERVENTION: Providers in the feedback arms were sent messages that announced their membership in an elite group of "Top Performers" or provided actionable recommendations with feedback for providers that were "Not Top Performers."

MAIN MEASURES: The primary outcomes were changes in peer ratings in the following performance dimensions after feedback was received: (1) elicitation of information from primary care practitioners; (2) adherence to institutional clinical guidelines; (3) agreement with peer's medical decision-making; (4) educational value; (5) relationship building.

KEY RESULTS: Specialists showed significant improvements on 3 of the 5 consultation performance dimensions: medical decision-making (odds ratio 1.52, 95% confidence interval 1.08–2.14, p<.05), educational value (1.86, 1.17–2.96), and relationship building (1.63, 1.13–2.35) (both p<.01).

CONCLUSIONS: The pandemic has shed light on clinicians' commitment to professionalism and service as we rapidly adapt to changing paradigms. Interventions that appeal to professional norms can help improve the efficacy of new systems of practice. We show that specialists' performance can be measured and improved with feedback using aspirational norms.

TRIAL REGISTRATION: clinicaltrials.gov NCT03784950

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11606-021-07002-1.

Inadequate communication, consultation, and coordination between providers contribute to poor health outcomes and increased healthcare costs. Specialists can play important roles in patient care both directly (by seeing patients themselves) and indirectly by collaborating with primary care practitioners (PCPs) in patient-centered care. 1, 2 Electronic consultation (eConsult) programs have been shown to be effective in reducing face-to-face specialist encounters through provision of timely and relevant input from specialists to PCPs on treatment or diagnosis. 3-5
Promoting improved communication between PCPs and specialists through eConsults might increase system capacity, extend PCP capability, improve access, reduce wait times, contain costs, and reduce the risk of healthcare-associated outpatient clinic transmission of coronavirus and other infectious diseases. 3, 5-11

Effective intraprofessional communication is a critical component of healthcare. 12, 13 A good interprofessional consultation can include bidirectional teaching that shares specialized knowledge about specific conditions (from specialist to PCP) and holistic longitudinal knowledge of the patient (from PCP to specialist). 14, 15 However, specialists' performance can vary widely, which may be attributable to the level of specialist engagement in the eConsult platform. 9, 16 Prior research has investigated PCP perceptions of the value of eConsult, 3, 7, 17 but strategies for improving consultation skills have been under-studied, and effective approaches have yet to be identified. 18, 19 The pandemic has accelerated adoption of systems that reduce contact with patients, with little opportunity to develop the necessary measurement and training.

Behavioral science suggests promising approaches to improve the quality of eConsults. "Nudges" are interventions that perturb behavior in a predictable way without restricting options or altering economic incentives. 20 One such nudge leverages the principle that people tend to conform to the behavior and/or expectations of their peers, 21-23 especially when expectations about behavior are otherwise unclear, as in electronic communication. 24 Communicating feedback about how one's peers expect one to behave provides pressure to meet this expectation. Likewise, the process of evaluating a peer can provide impetus to improve one's own performance. 25 We study both types of peer effects. Providing normative feedback about performance relative to peers has led to subsequent improvements in productivity and quality of care. 22, 26, 27 Feedback based on peers' evaluations may be seen as more valid than standardized metrics, potentially conferring greater accountability of the recipient. 27, 28 Peers' judgment of performance changes behavior based on mutual intraspecialty and intra-organizational trust. 29

We have applied this approach to better understand how nudges might improve communication between PCPs and specialists using the eConsult system. We conducted the first randomized trial designed to assess the extent to which specialists' exposure to peer ratings enhances consultation quality.

The Los Angeles County Department of Health Services (LADHS) eConsult clinician steering committee collaborated with health services researchers and behavioral scientists to design the intervention, leveraging evidence that feedback is effective when it includes aspirational norms from social "in-groups" of professional peers. 22, 30, 31 Because the act of rating other specialists' eConsults is itself an intervention, the trial was designed to adjust for the impact of rating others. In the first phase of the trial, only one group provided ratings. In the second phase, the original rating group and a second naïve group received feedback about their performance, and a third naïve group began the rating process.
By comparing rated eConsults of raters to non-raters in the first phase, we can observe the impact of simply providing ratings to other providers without receiving feedback (see Section 4 of the trial protocol; Appendix 1).

LADHS is the second-largest public health provider in the USA. It is an integrated system with 23 health centers, 4 hospitals, and a network of referring federally qualified health centers serving about 670,000 patients annually. The LADHS eConsult platform allows PCPs to direct requests to a specific specialty for guidance and referrals through asynchronous messages. 9 eConsult requests are reviewed by specialist reviewers recruited by DHS leadership. PCPs describe the case, optionally upload documentation, and ask questions of the assigned specialist. Specialists respond to eConsults in free text and may include pre-composed content. Prior to this trial, specialists were encouraged by institutional leadership to pursue 5 aspirational principles: responsiveness, collaboration, equity, customer service, and effective practice. However, until this trial, specialists received feedback only on their eConsult productivity (i.e., timeliness and number of eConsults completed per month).

We developed and tested a rating instrument that reflected institutional and local professional eConsult practice standards. We interviewed 10 subspecialists with high volumes of eConsults about characteristics they considered to be markers of high-quality eConsults (see interview guide; Appendix 2). Using interviewees' examples of eConsults that they considered "bad" and "good" interactions and their rationale for these designations, we devised a set of five performance dimensions using established methodology. 32 These included efforts to elicit additional information from PCPs (when needed), two aspects of medical decision-making (adherence to guidelines or "Expected Practices," which provide consistent and targeted decision support in efforts to achieve clinical practice standardization, 33 or agreement with decision-making when no guideline applied), inclusion of educational content for the PCP, and collegiality (i.e., strengthening or weakening the interpersonal relationship between PCP and subspecialist). These dimensions overlap with those identified in other recent studies. 15 The rating instrument included gate questions ensuring that only relevant consults were rated. One investigator (MWF) drafted the rating instrument to assess these dimensions, and the entire research team suggested multiple rounds of revisions. We finalized the instrument when no additional revisions were suggested (Appendix 3). We randomly selected 135 eConsults for double rating and found a high degree of interrater agreement in each of the five dimensions: (a) elicitation of information from PCPs (87.5%); (b) adherence to institutional guidelines (68.4%); (c) agreement with peer's medical decision-making (94.0%); (d) educational value (88.9%); and (e) relationship building (98.0%) (eTable 1).

The rating instrument was integrated directly into the eConsult platform, along with a list of other specialists' eConsults to be rated, displayed in a new section below each specialist's own list of pending eConsult requests. The eConsult software module collected rating responses while allowing raters to remain anonymous. Specialists were provided with guidance on the rating process (see Quick Guide; Appendix 4) and estimated spending about 5 minutes rating each eConsult.
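As an illustration of the double-rating check described above (not the study's actual code or data), simple percent agreement, and a chance-corrected companion statistic such as Cohen's kappa, can be computed from paired ratings as in the following minimal sketch; the example ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Share of double-rated eConsults on which the two raters gave the same rating."""
    assert len(r1) == len(r2) and len(r1) > 0
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Chance-corrected agreement: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(r1)
    p_obs = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    p_exp = sum((c1[k] / n) * (c2[k] / n) for k in set(c1) | set(c2))
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical paired ratings of the same eConsults on one dimension (1-5 scale).
rater_a = [5, 4, 5, 3, 5, 5, 4, 5, 2, 4]
rater_b = [5, 4, 5, 3, 5, 4, 4, 5, 2, 4]
print(f"percent agreement: {percent_agreement(rater_a, rater_b):.1%}")
print(f"Cohen's kappa:     {cohens_kappa(rater_a, rater_b):.2f}")
```

The study reports percent agreement (eTable 1); kappa is shown here only as a commonly used complement.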
eConsults were added to the rating task list weekly, drawn at random from the baseline period of eConsults (ranging from 6 to 11 months in length depending on study arm; eFigure 1) and assigned to specialists from the same discipline. By including eConsults that occurred prior to the intervention, we established baseline performance for rated specialists. Structured data about consultants' identities and dates of service were masked to minimize bias.

Past studies have successfully nudged providers to improve decisions by providing them with information about their performance on standardized quality metrics, such as guideline adherence, relative to top-performing peers, labeling recipients explicitly as either a "Top Performer" or "Not Top Performer." 22, 27 However, this technique has not yet been evaluated using structured subjective peer ratings of performance. Providers in the feedback arms were sent messages that either announced their membership in an elite group of "Top Performers" or provided actionable recommendations with feedback for "Not Top Performers." "Top Performers" were those with peer ratings in the top tenth, including ties. Importantly, the phrase "Top Performer" or "Not Top Performer" was included in the email subject line. The body of the email included the recipient's ratings on each dimension, ratings of top-performing peers, links to rated eConsults for reference, and suggestions for improvement when relevant (feedback templates; Appendix 5). All messages were sent from an executive physician in the health system (PG).

The outcomes of interest were changes in ratings on each performance dimension before versus after feedback was received. As a secondary outcome, we analyzed improvements in the rate of consultations in which a resolution was reached without a face-to-face specialist visit, a commonly used measure of eConsult effectiveness. 3-5

Inclusion/Exclusion Criteria: All clinicians in all specialties regularly using eConsult were included, with the exception of podiatry and surgery.

After adjusting the effective sample size to accommodate an intracluster correlation of 0.055, 34 a 1-point increase in rating was detectable with 80% power (1-β=0.80) and α=0.05 with 24 specialists and a total of 81 ratings per arm. Specialists were randomly allocated to the feedback intervention (two-thirds) or to control with no feedback (one-third). To minimize contamination, specialties affiliated with different facilities were grouped for assignment to study arms, resulting in 80 facility-specialty clusters that were randomized. It was not possible to blind specialists to the intervention assignment. Specialists were blinded to rater identity; structured data about consultants' identities were removed from the eConsult rating interface. Each eConsult was randomly assigned within specialty. This allocation allowed us to measure the extent to which rating other specialists' eConsults affected raters' own performance.

We used mixed-effects ordinal logistic regressions to estimate the impact of treatment. In each regression, the consult's rating on a given dimension was the dependent variable, with independent variables for treatment group assignment at the time of the consult, time relative to the start of the trial, and the rated consultant's history of rating others at the time of the consult. Random effects addressed repeated measures, variation across specialties, and the nesting of both specialists within specialty and eConsults within specialists.
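As an illustration of the "top tenth, including ties" rule described above, the sketch below shows one way such a cutoff could be computed; the specialist identifiers and scores are hypothetical, not the study's data.

```python
import pandas as pd

# Hypothetical mean peer rating per specialist (specialist ID -> mean rating).
mean_ratings = pd.Series({"s01": 4.9, "s02": 4.9, "s03": 4.6, "s04": 4.2,
                          "s05": 4.0, "s06": 3.8, "s07": 3.5, "s08": 3.1,
                          "s09": 2.9, "s10": 2.6})

# Use the rating attained at the 90th-percentile rank as the cutoff and keep everyone
# at or above it, so specialists tied at the cutoff also count as "Top Performers."
cutoff = mean_ratings.quantile(0.9, interpolation="higher")
top_performers = mean_ratings[mean_ratings >= cutoff]
print("Top Performers:", sorted(top_performers.index))   # ['s01', 's02'] - the tie at 4.9 is included
```

The intracluster-correlation adjustment in the sample-size calculation follows the standard design-effect logic for cluster-randomized trials; 34 a rough worked example under the stated assumptions (81 ratings spread over 24 specialists per arm, ICC = 0.055), not the authors' exact calculation, is:

```python
# Illustrative design-effect arithmetic.
icc = 0.055
ratings_per_arm, specialists_per_arm = 81, 24
m_bar = ratings_per_arm / specialists_per_arm     # average ratings per specialist, ~3.4
design_effect = 1 + (m_bar - 1) * icc             # ~1.13
effective_n = ratings_per_arm / design_effect     # ~72 effectively independent ratings per arm
print(f"design effect ~ {design_effect:.2f}; effective n per arm ~ {effective_n:.0f}")
```

Finally, mixed-effects ordinal logistic regressions of the kind described above would typically be fit in software that supports random effects for ordinal outcomes (for example, R's ordinal::clmm). As a simplified sketch only, the snippet below fits the fixed-effects portion of such a model, a proportional-odds logit, in Python's statsmodels on simulated data; the variable names are hypothetical, and the random effects for specialty and specialist that the authors describe are omitted.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 500

# Hypothetical analysis file: one row per rated eConsult.
df = pd.DataFrame({
    "post_feedback": rng.integers(0, 2, n),        # consult occurred after the consultant received feedback
    "months_since_start": rng.uniform(0, 18, n),   # time relative to trial start (secular trend)
    "has_rated_others": rng.integers(0, 2, n),     # consultant had already rated peers at this time
})

# Simulate an ordinal 1-5 rating loosely related to the predictors (demonstration only).
latent = (0.4 * df["post_feedback"] + 0.02 * df["months_since_start"]
          + 0.3 * df["has_rated_others"] + rng.logistic(size=n))
df["rating"] = pd.cut(latent, bins=[-np.inf, -0.5, 0.3, 1.0, 1.8, np.inf],
                      labels=[1, 2, 3, 4, 5])      # ordered categorical outcome

# Proportional-odds (ordinal logistic) model of the rating on treatment, time, and rater history.
model = OrderedModel(df["rating"],
                     df[["post_feedback", "months_since_start", "has_rated_others"]],
                     distr="logit")
res = model.fit(method="bfgs", disp=False)
print(res.summary())
# Exponentiated coefficients are odds ratios for receiving a higher rating.
print(np.exp(res.params[["post_feedback", "has_rated_others"]]))
```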
This model allowed us to estimate the effects of the feedback intervention on each dimension of eConsult performance, adjusting for secular trends in ratings over time, participation in the peer rating process, specialty, and specialist-level random effects. All study procedures were approved by the University of Southern California Institutional Review Board prior to commencement.

Figure 1 shows the flow of participants. Of the 214 originally randomized specialists, 64 were excluded from the analysis for lack of ratings, including one withdrawal. Table 1 shows the distribution of characteristics of participants by arm, including specialty. Table 2 shows the baseline performance of specialists in each arm, the number of PCPs per specialist, the proportion of consults in each rating category that resolved without a face-to-face specialist visit, and the odds that rating improvements in each dimension improved resolution outcomes.

Intervention and rating activities were staggered to minimize interaction among participants from different clusters and to account for the effects of rating other specialists' eConsults. The first ratings were assigned in March 2017; the first feedback was delivered in August 2017. We are unable to measure whether specialists opened the feedback emails; our intent-to-treat analysis includes all persons assigned to feedback. Because rating was optional and assignment patterns were random, the number of completed reviews was uneven across the 144 specialists who were assigned ratings; 45 raters completed at least one assigned review.

We analyzed 2190 eConsults completed during the pre-intervention period (Table 2) and 1064 in the intervention period, for a total of 3254 ratings. Ratings were not solicited if the reviewer deemed the consult to be administrative rather than clinical; 92% of PCPs' inquiries were categorized as clinical (i.e., not administrative). If the PCP's initial questions required additional information gathering (760 consults, 23.4%), the specialist's effectiveness in eliciting additional information was rated. If the rater judged that institutional guidelines applied (1189 consults, 36.5%), the consultant's adherence to the recommendation was rated. If no guideline applied, subjective agreement with medical decision-making was rated instead (2065 consults, 63.5%). If the rater reported that the PCP's question presented an educational opportunity for the PCP (1441 consults, 44.3%), the educational value of the specialist's response was rated. All eConsults included in the analysis were rated on whether the specialist's response to the PCP would cause their professional relationship to worsen, remain the same, or improve.

Receiving normative feedback from peers improved performance on four of the five dimensions nominated for evaluation during the instrument design phase. Table 3 and Figure 2 show the adjusted odds that performance improves after the rating and feedback intervention. Receiving feedback improved performance in all rated dimensions with the exception of elicitation of information from PCPs, with significant improvement in three of the four improved domains. Rating other specialists' eConsults improved raters' own performance on two performance dimensions: expert elicitation (OR 1.86; 95% CI 1.04-3.35) and relationship building (OR 1.44; 95% CI 1.01-2.06).
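For readers interpreting these odds ratios and those in Table 3 and Figure 2, each reported odds ratio and confidence interval is an exponentiated model coefficient. As a back-of-envelope illustration (the coefficient and standard error below are inferred from the abstract's medical decision-making result, not taken from the model output):

```python
import math

# A log odds ratio (ordinal logit coefficient) of ~0.42 with standard error ~0.175
# approximately reproduces the medical decision-making result reported in the abstract.
beta, se = 0.42, 0.175

odds_ratio = math.exp(beta)              # ~1.52
ci_low = math.exp(beta - 1.96 * se)      # ~1.08
ci_high = math.exp(beta + 1.96 * se)     # ~2.14
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
# A 95% CI that excludes 1.0 corresponds to p < 0.05 on a two-sided Wald test.
```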
Across all performance dimensions, higher baseline ratings were associated with PCPs resolving the case without face-to-face specialist visits; lower rates of face-to-face visits were significantly associated with information solicitation, educational value, and relationship building (Table 2). [Table 2 notes: bold statistics are the odds ratio (95% CI) that the PCP will resolve the case without a face-to-face specialist visit, +p<0.1, **p<0.05, ***p<0.001, adjusted for specialist random effects; the adjusted proportion of consultations at each rating level that resolved without a face-to-face visit is also shown.] Feedback improved resolution outcomes, but the adjusted reductions were not significant (eTable 2).

This RCT shows that specialists can become more effective electronic consultants with feedback, significantly improving on ratings of medical decision-making, education, and relationship building. Most prior studies of consultation quality have used blunt instruments like data transfer or specialist utilization. 35, 36 Several previous studies have successfully employed feedback using standardized measures in other clinical domains, 22, 26, 27 but feedback based on peers' ratings has not been explored. Our strategy of comparing specialists to "Top Performers" leverages aspirational social norms that may be particularly motivating to professionals who identify with high standards. First, while people naturally tend to adhere to perceived social norms, highlighting the top tenth rather than average performance sets a high but achievable bar. Moreover, "Top Performer" status signals an injunctive norm without disclosing practitioners' precise position in a distribution of peers, preventing regression among top performers. 23 Second, consultation performance does not lend itself easily to objective measurement in the same way that metrics of productivity or prescribing do. Ratings from peer physicians (particularly specialists) familiar with the patient population and local practice standards may confer greater credibility than standard metrics. Furthermore, associations between our ratings and eConsult outcomes used in prior studies affirm the predictive validity of the ratings. 4, 9

The results we observe suggest ratings from peers are not dismissed by practitioners. Performance significantly improves after feedback, and this effect remains after accounting for participation in the rating process and associated observer effects. We show this type of information can be marshalled and framed as feedback to encourage specialists to promote relationships with PCPs that advance evidence-based practices with educational value.

Not all improvements were statistically significant, perhaps for interpretable reasons that can inform future design. In particular, the effect of feedback on concordance with institutional guidelines was not significant. Tellingly, interrater agreement on whether an institutional guideline applied was lowest among our measures; this variability suggests that knowledge of guidelines among specialists was imperfect. Additionally, we noted during our interviews that some specialists had never entertained the idea that institutional guidelines might apply to their own decision-making. Another promising result was that participating in the rating process itself appears to independently improve the rater's own performance, but rating, an optional task, had uneven uptake. A larger sample may also allow further exploration of heterogeneity of effects, particularly across specialties.
Another optimistic finding is that the intervention, which did not explicitly address referrals, reduced the need for face-to-face specialist encounters compared to controls. While some improvements were not significant with this sample size, this suggests that, at scale, the nudge may increase PCP capacity, potentially outweighing its costs.

Future work may shed more light on the impact and mechanisms behind this intervention. For example, we might investigate the long-term impact on utilization, guideline adherence, and patient outcomes. We did not perform content analysis after development and validation of the rating instrument. Systematic coding and abstraction of consult text may explain which features of the communication are associated with ratings. These types of follow-up studies may reveal which of the rated dimensions are most important for improving patient and professional outcomes, optimize the instrument, and provide more specific guidance to specialists.

While our trial was conducted in a large system with diverse facilities and participants, conclusions about generalizability require additional research. Not all health systems have an incentive to facilitate communications between PCPs and specialists. However, COVID-19 has rapidly increased the importance of remote communication, and recently introduced Medicare benefits make eConsults more accessible as a means to improve care coordination, 37 increasing the potential impact.

[Figure 2. Adjusted odds of improvement after feedback: the odds ratio for improvement on each rated dimension after feedback.]

This intervention may require adaptation in other health systems to ensure rating instruments and rating processes are consistent with institutional culture and goals. Environments with different status relationships among specialists might affect the perceived credibility and impact of ratings; for example, a senior surgeon may not consider a junior co-specialist a "peer" in all settings. As with other process-based measures of clinical quality, patient outcomes may be difficult to attribute to observed improvements. Additionally, by design, some of the specialists in the "No Feedback" arm performed some ratings; this might have attenuated the observed effect size. To our knowledge, this is the first time peer rating and feedback have been used to evaluate and improve specialist communication. Because our study involved specialist peer ratings, having the referring PCPs rate the eConsults with a parallel feedback intervention might be an important area for comparative research. Finally, using face-to-face visits as a proxy for outcomes does not fully capture the extent to which eConsult outcomes depend on cases, specialists, and specialty. While our analysis controls for temporal trends and random effects at both the specialty and specialist level (both significant), the cluster-randomized design did not generate balanced allocation of specialties across arms; we cannot completely rule out unobserved confounders at the specialist and specialty levels. Adding rating responsibilities to specialists' already demanding practice workloads might meet resistance; scaling and standardizing this model may be challenging. Given the expense of specialists' time, the costs of participating in rating should be balanced against its short- and long-term value. Differences between raters in the same specialty may suggest that time devoted to calibration is required.
While this approach is far less costly than previous intraprofessional training programs, 18 the cost-effectiveness of the rating system we tested is unknown. Larger-scale studies may be needed to determine under what circumstances these interventions are comparatively cost-effective at a system and societal level.

Our study predates the pandemic, a time when virtually all specialty visits were in person. Trends show that specialists did not adopt telehealth at rates comparable to PCPs in 2020 and have more quickly resumed in-person practice to near-2019 levels. 38, 39 Reducing referrals will continue to be an important way to limit in-person contact, and eConsult service providers have developed resources for optimizing the use of eConsults for this purpose. 40 Additional research is needed to understand the impact on specialist visit referrals across modalities, including how best to tailor the selection of in-person visits versus eConsults.

Using peer ratings, the quality of specialists' eConsults can be measured and improved by informing specialists of their performance compared to their top-performing colleagues.

Funding: This research was supported by the Blue Shield of California Foundation (Grant #16398089). Dr. Linder is supported in part by a contract from the Agency for Healthcare Research and Quality (HHSP233201500020I) and grants from the National Institute on Aging (including P30AG059988) and the Peterson Center on Healthcare. The funding source had no role in the design of this study, nor any role during its execution, analyses, interpretation of the data, or decision to submit results.

References
1. Transforming Specialty Practice - The Patient-Centered Medical Neighborhood
2. Coordination of specialty referrals and physician satisfaction with referral care
3. Impact of and Satisfaction with a New eConsult Service: A Mixed Methods Study of Primary Care Providers
4. Improving Access to Gastroenterologist Using eConsultation: A Way to Potentially Shorten Wait Times
5. Keeping care connected: e-Consultation program improves access to nephrology care
6. Adoption and impact of an eConsult system in a fee-for-service setting
7. Primary Care Practitioners' Perceptions of Electronic Consult Systems: A Qualitative Analysis
8. E-referral Solutions: Successful Experiences, Key Features and Challenges - a Systematic Review
9. Los Angeles Safety-Net Program eConsult System Was Rapidly Adopted And Decreased Wait Times To See Specialists
10. eReferral - A New Model for Integrated Care
11. Hospital outpatient clinics as a potential hazard for healthcare associated infections
12. Colleagues Unknown - How Peer Evaluation Could Enhance the Referral Process
13. Primary care physicians' links to other physicians through Medicare patients: the scope of care coordination
14. How to write a psychiatric consultation
15. What makes a high-quality electronic consultation (eConsult)? A nominal group study
16. Referral and consultation communication between primary care and specialist physicians: finding common ground
17. eConsults and Learning Between Primary Care Providers and Specialists
18. Learning intraprofessional collaboration by participating in a consultation programme: what and how did primary and secondary care trainees learn?
19. Intraprofessional collaboration and learning between specialists and general practitioners during postgraduate training: a qualitative study
20. Nudge: Improving Decisions about Health, Wealth and Happiness
21. A Room with a Viewpoint: Using Social Norms to Motivate Environmental Conservation in Hotels
22. Improving quality improvement using achievable benchmarks for physician feedback: a randomized controlled trial
23. The Constructive, Destructive, and Reconstructive Power of Social Norms
24. Computer-Mediated Communication: Impersonal, Interpersonal, and Hyperpersonal Interaction
25. A large-scale field experiment shows giving advice improves academic outcomes for the advisor
26. Closing the Productivity Gap: Improving Worker Productivity Through Public Relative Performance Feedback and Validation of Best Practices
27. Effect of Behavioral Interventions on Inappropriate Antibiotic Prescribing Among Primary Care Practices: A Randomized Clinical Trial
28. Real-time Feedback in Pay-for-Performance: Does More Information Lead to Improvement?
29. Physicians' Trust in One Another
30. To be and not to be: Lifestyle imagery, reference groups, and the clustering of America
31. Where Consumers Diverge from Others: Identity Signaling and Product Domains
32. Whatever happened to qualitative description?
33. Development and Implementation of Expected Practices to Reduce Inappropriate Variations in Clinical Practice
34. Design and Analysis of Cluster Randomization Trials in Health Research
35. Closing the Referral Loop: Improving Ambulatory Referral Management, Electronic Health Record Connectivity, and Care Coordination Processes
36. Closing the Referral Loop: an Analysis of Primary Care Referrals to Specialists in a Large Health System
37. Medicare's Approach to Paying for Services That Promote Coordinated Care
38. The Impact of the COVID-19 Pandemic on Outpatient Care: Visits Return to Prepandemic Levels, but Not for All Providers and Patients
40. COVID-19 E-Consult Resources

Conflict of Interest: Hal Yee is an advisor to RubiconMD. All other authors declare that they do not have a conflict of interest.