key: cord-0730609-ecf39yzp
authors: Higgins, Matthew J.; Yan, Xin; Chatterjee, Chirantan
title: Unpacking the effects of adverse regulatory events: Evidence from pharmaceutical relabeling
date: 2020-09-12
journal: Res Policy
DOI: 10.1016/j.respol.2020.104126
sha: a763905fd23180331c3cfdcfcf9f34dc5d94b6c2
doc_id: 730609
cord_uid: ecf39yzp

We provide causal evidence that regulation induced product shocks significantly impact aggregate demand and firm performance in pharmaceutical markets. Event study results suggest an average loss between $569 million and $882 million. Affected products lose, on average, $186 million over their remaining effective patent life. This leaves a loss of between $383 million and $696 million attributable to declines in future innovation. Our findings complement research that shows drugs receiving expedited review are more likely to suffer from regulation induced product shocks. Thus, it appears we may be trading off quicker access to drugs today for less innovation tomorrow. Results remain robust to variation across types of relabeling, market sizes, and levels of competition.

The role that downstream market demand plays on upstream innovation has long been recognized in the literature (e.g., Schumpeter, 1942; Griliches and Schmookler, 1963; Schmookler, 1966; Nordhaus, 1969) . More recent work has linked R&D intensity and demand (Pakes and Schankerman, 1984) , market entry and expected revenues (Scott Morton, 1999; Reiffen and Ward, 2005) , market size and innovation (Acemoglu and Linn, 2004; Finkelstein, 2004; Dubois et al., 2015) , public procurement and innovation (Edler and Georghiou, 2007) and societal demands and research priorities (Ciarli and Rafols, 2019) . Another strand of literature has considered the interplay between various demand-side and supply-side factors and their impact on innovation (Peters et al., 2012; Kalcheva et al., 2018) .

The above stream is enriched by scholars focusing on shocks to demand and its resulting impact on innovation. Recent work by Manso et al. (2019) finds large positive demand shocks generate more R&D, however it tends to be more incremental than radical innovation. Several papers find evidence that the passage of Medicare Part D increased innovation of drugs targeting conditions prevalent among elderly patients (Blume-Kohout and Sood, 2013; Dranove et al., 2014) . Using the same identification strategy, Hermosilla and Wu (2018) demonstrate the impact on external technology markets; downstream commercializers increased their rate of licensing from upstream innovators. Finally, at a more macro-level, shocks to aggregate demand have been shown to impact investments in innovation capacities (Paunov, 2012; Armand and Mendi, 2018) .

Given this link between downstream market demand and upstream innovation, understanding how exogenous product shocks influence demand in the first place itself is critical for firms and policymakers. Herein, one important and understudied source of demand-side shocks is regulation. Regulation plays an important role in protecting consumers but regulation can also impede firms (Aghion et al., 2019) and markets. For example, regulation can restrict firms' freedom of actions (Palmer et al., 1995) and slow the diffusion of new technologies (e.g., Joskow, 1981) . If these demand shocks are significant enough, we would expect them to negatively impact current firm performance and potentially dampen future innovation (Ball et al., 2018) .

Our study uses novel data to examine the impacts of safety-related regulatory product shocks in the pharmaceutical industry on aggregate demand and firm performance. The drug development process is long and expensive with a low probability of receiving Food and Drug Administration (FDA) approval (Wong et al., 2018) . 1 As part of the approval process drug candidates undergo clinical trials designed to test their safety and efficacy. In the post-approval period, the FDA maintains a surveillance program that continues to monitor drug safety. The FDA Adverse Events Reporting System (FAERS) database was designed to collect complaints and adverse events for approved drugs. Depending on the situation and severity of the data collected, the FDA will act and move to change the safety label associated with a drug (known as 'relabeling').

While prior studies have focused on various types of relabeling (e.g., Macher and Wade, 2016; Qureshi et al., 2011; Dorsey et al., 2010), most have limited their analyses to a single or small number of therapeutic markets (e.g., Jacoby et al., 2005). These studies are important because we learn about the intricacies and nuances of specific markets but we are unable to draw conclusions about the overall impact of relabeling across markets. In contrast, using a dataset of all drugs sold across all therapeutic markets in the U.S. and U.K. we use a difference-in-differences (diff-in-diffs) specification to provide causal evidence relating the impacts of FDA drug relabeling on aggregate consumer demand and firm performance. 2 We find that, on average, aggregate demand declines by 16.9% within two years of a relabeling event. Our data allows us to capture intra- and inter-market substitution patterns as well as competitive responses. Critically, after accounting for these factors we still find that aggregate demand declined by 4.7%, an estimate that represents consumers who prematurely leave the market.

Next, we explore the variation across the severity of relabeling events. Not unexpectedly, we find an increasing aggregate demand response as relabeling severity increases, ranging from a decline of 15.6% for the least severe to a decline of 36.3% for the most severe. After accounting for all plausible substitution patterns we find declines in aggregate demand ranging from 4.0% for the least severe to 8.3% for the most severe relabel. This pattern, however, is not homogeneously distributed across all markets. When we focus on the variation in relabeling activity across individual markets, interesting patterns begin to emerge. In "low-intensity markets" or those with low levels of relabeling activity we find declines in drug aggregate demand are completely absorbed by intra-market substitution. In contrast, in "high-intensity markets" or those with high levels of relabeling activity, after accounting for plausible substitution patterns, aggregate demand declines by 5.0%.

Our findings have implications for firms. First, our results suggest that current efforts by firms to counteract the impacts from these negative regulatory shocks, on average, appear to be failing. Importantly, the magnitude of our results for relatively minor safety relabeling suggests that physicians may be proactively shifting consumers to other drugs (or competitors are successfully "pulling" consumers away). This implies that while detailing (i.e., direct advertising to physicians) may be effective at influencing initial physician prescription behavior (e.g., Datta and Dave, 2017) , this influence appears to break down when confronted with negative safety information. Unfortunately, while we can detect the shift in behavior, we can only conjecture on the underlying motivations driving physician behavior. 3 Given the prior literature, these effects may not be isolated to just downstream aggregate demand but may also extend upstream. In a recent paper, Krieger et al. (2018) explore how pharmaceutical firms react to negative shocks to existing products. They show that affected firms increase R&D expenditures but those expenditures are more likely to go towards the acquisition of new pipeline candidates versus internally developed candidates. 4 Importantly, they also show that competitors move resources away from affected therapeutic categories, reshuffling their own drug portfolios. Our findings and those of Krieger et al. (2018) are intimately linked; we provide evidence of the initial downstream aggregate demand impacts from negative regulatory shocks while they provide evidence of subsequent upstream innovation changes. 5 In an effort to estimate the economic losses from these regulatory shocks on firm performance, we conduct an event study. 
The advantage of the event study methodology in this instance is that it will capture not only the effects from the unanticipated loss in revenues from the affected product but also loss in value from declines in future innovation. Results across two different standard event windows translate into an average loss in the range of $569 million to $882 million. Back of the envelope calculations suggest that the unanticipated loss in revenues from affected products averages $186 million over their remaining effective patent life. This leaves an unanticipated loss of between $383 million and $696 million, on average, attributable to declines in future innovation. These results, along with Krieger et al. (2018) , support the notion that downstream regulatory shocks have significant impacts on upstream innovation.
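The back-of-the-envelope decomposition above is a simple subtraction: the event-study loss less the lost revenue on the affected product gives the residual attributed to future innovation. A minimal sketch using the figures reported in the text (the function name is illustrative and not from the paper):

```python
def innovation_loss_range(total_loss_low, total_loss_high, product_revenue_loss):
    """Residual loss attributed to future innovation, in $ millions.

    Subtracts the unanticipated revenue loss on the affected product from
    each end of the event-study loss range.
    """
    return (total_loss_low - product_revenue_loss,
            total_loss_high - product_revenue_loss)


# Figures from the text, in $ millions.
low, high = innovation_loss_range(569, 882, 186)
print(low, high)  # 383 696
```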

Finally, our findings contribute empirical evidence to the economics of regulation literature dating back to Brown et al. (1964), Nelson (1970) and Stigler (1971). The breadth and depth of our data allow for a unique analysis of aggregate demand that captures all plausible substitution patterns. Regulation is rarely without cost, as is the case here.

1 Wong et al. (2018) place the probability of a drug candidate reaching FDA approval at 13.8%.

2 For ease of exposition we use the term aggregate consumer demand interchangeably with demand. To be precise we are referring to aggregate consumer demand. Our data is at the standard unit level and not at the individual prescription level. Standard units are determined by IMS Health and are intended to equate pills, tablets and liquids.

3 Explanations range from physicians practicing "defensive medicine", being concerned that less serious safety concerns will eventually unmask more serious concerns ("where there is smoke, there is fire"), lack of adequate information, or being induced by competitor firm detailing efforts (Macher and Wade, 2016). Current work by the authors involves a large-scale survey with a national association of physicians to understand what drives prescription changes in the face of negative safety-related information. Preliminary, qualitative evidence seems to suggest some combination of defensive medicine and marketing efforts by competitors, consistent with those described in Macher and Wade (2016).

4 The relationship between product or pipeline shocks and subsequent technology acquisition in the pharmaceutical industry was previously considered in Higgins and Rodriguez (2006), Danzon et al. (2005) and Chan et al. (2007). The importance of Krieger et al. (2018), however, is that they provide causal evidence of this relationship.

5 The linkage of downstream product shocks and upstream innovation has been explored in other contexts. For example, Ball et al. (2018) examine the upstream innovation impacts as a result of downstream medical device recalls.

Importantly, this is a market that is under immense time constraints given the limited nature of patent protection. Regulators therefore face a tension between length of trials and getting new drugs to market. 6 Into this mix the FDA has developed pathways for expedited development and review including priority review, breakthrough therapy, accelerated approval and fast track. 7 Recent evidence suggests, however, that there has been an increase in safety label changes for drugs that have moved through some form of expedited pathway (Mostaghim et al., 2017; Moore and Furberg, 2014; Carpenter et al., 2008; Olson, 2008). These label changes are not trivial; Mostaghim et al. (2017) report a doubling of the most severe types of relabel for expedited drugs relative to non-expedited drugs. With impacts from safety label changes rippling both downstream and upstream, it suggests that regulators may have tipped the balance too far towards getting new drugs to market. More fundamentally, our results combined with those of Krieger et al. (2018) suggest that we may be trading off quicker access to new drugs today for less innovation tomorrow. 8

The remainder of the paper is organized as follows. In Section 2 we discuss the FDA drug relabeling process and in Section 3 we focus on adverse regulatory events. This is followed by our empirical strategy and data in Section 4. Results and robustness are reported in Sections 5 and 6, respectively, before we conclude in Section 7.

The pharmaceutical industry in the U.S. is highly regulated and drug candidates undergo rigorous clinical testing prior to being submitted to the FDA for approval. During this process possible risks and side effects of a drug candidate are identified. This information becomes part of the FDA approved label and drug insert that accompanies a newly approved drug. Unfortunately, some side effects do not become known until after a drug has been approved. To help with the reporting and collection of these adverse events the FDA founded MedWatch in 1993. Healthcare professionals or consumers (patients) can voluntarily report to MedWatch. In more recent times this data on adverse events has been made available via FAERS. 9 During the post-approval time period the FDA monitors adverse reporting along with results from post-approval studies and peer-reviewed literature. Negative safety-related information is scrutinized and the FDA can form an investigation team to determine if a safety label update is needed. If they believe a safety label change is warranted the manufacturer is notified and is required to report back to the FDA within a predetermined period. The agency works privately with a manufacturer to determine which type of safety label change will be made. 10 At the end of the process the FDA will publish this information online while allowing firms additional time to change actual printed material. 11 Prior to 2016 product safety data was available via MedWatch but has since shifted to the FDA Drug Safety Label Change database.

The main safety labeling changes that the FDA issues include: adverse reaction, precaution, warning, contraindication, and box warning. 12 These classifications serve to inform physicians and consumers of possible health concerns that have been clinically identified, anticipated to occur, or associated with unapproved uses. A drug that has been relabeled can undergo additional safety label changes in the future, if warranted. The box or "black box" warning is the most severe type of label change and is intended to communicate potentially severe health risks resulting from taking a drug.

While the FDA's procedure and process for a drug safety relabel is well established, there is no guarantee that the updated information will be read by physicians or consumers. In a world of perfect information we might assume that this new information will be read; however, the evidence seems to suggest otherwise. One form of communication that firms use to convey new safety information, the "Dear Doctor letter" (DDL), was found in 28% of cases to be deficient in its overall level of effectiveness (Mazor et al., 2005). 13 Other studies have shown that fewer than one in ten physicians routinely read drug labels. 14 Similarly, Hoy and Levenshus (2018) find that consumers routinely ignore safety related information. Macher and Wade (2016) shed an important light on the underlying mechanism of how physicians may be learning about safety label changes. In the case of black-box warnings, they find that affected firms themselves may increase physician detailing (i.e., direct-to-physician advertising) but that competitor firms also increase detailing efforts. So while affected firms may try to actively deal with the problem, it does appear that competitors take advantage of the opportunity and try to pull physicians to their products. While this study is limited to black-box warnings, there is no reason to believe that there wouldn't be some type of similar response across the spectrum of relabeling activity.

There appears to be some hope in that new technology may be able to help with this information asymmetry. In a recent paper, Arrow et al. (2020) show that physicians with access to a drug reference database changed their prescribing behavior. While this study focused on the shift from branded to generic drugs, they suggest that physicians may be responding to non-clinical information such as whether a drug is covered by a consumer's insurance plan or plan-specific drug pricing. The database in this study included FDA drug safety information and alerts but the information was not explicitly analyzed. However, the fact that physicians were taking the time to interact with this type of technology does suggest that it may be an effective mechanism to deliver safety relabel information. A major concern with new technology is ensuring that it diffuses out to physicians, especially those practicing in non-academic or rural settings. 15

We draw on several strands of literature starting with the economics of regulation. Early work in this area theorizes on the impact of regulation on consumer and firm behavior (e.g., Stigler, 1971; Peltzman, 1976; Migue, 1977). Brown et al. (1964) argued that regulation could be viewed as an information transmission process. As consumers receive new information they are able to update and change their behavior. Subsequent work built on this idea to show how information influences consumer perception of product quality (Zeithaml, 1988) and how behavior changes with positive information (Nelson, 1970). In contrast, Hartley (1994) showed how negative product information led to decreased sales. More broadly, Oberholzer-Gee and Mitsunari (2006) examined how non-related negative events, in their case the release of pollution information, decreased property values. In our context, the process of relabeling in the pharmaceutical industry can be viewed as an information transmission process that subsequently impacts behavior.

6 In response to COVID-19, there has been immense pressure to speed the trials of Remdesivir®: https://www.nytimes.com/2020/05/02/us/politics/vaccines-coronavirus-research.html

7 https://www.fda.gov/forpatients/approvals/fast/default.htm. These are coupled with other initiatives such as the 21st Century Cures Act.

8 We must include the caveat that while there may be less innovation tomorrow, we cannot say anything about the type or novelty of the lost innovation. In a Health Affairs blog post, Pranav Aurora and authors conjectured about possible innovation implications from priority review vouchers. https://www.healthaffairs.org/do/10.1377/hblog20160615.055372/abs/

9 http://www.nber.org/data/fda-adverse-event-reporting-system-faers-data.html

10 A firm may know that a relabel will impact demand and firm performance and as such they have the incentive to 'drag their feet'. The extent this is possible or occurs is unknown but remains a possibility.

11 https://www.fda.gov/downloads/drugs/guidancecomplianceregulatoryinformation/guidances/ucm250783.pdf

12 Guidance for industry is provided by: https://www.fda.gov/downloads/drugs/guidances/ucm075096.pdf

13 Attempts to improve the effectiveness of labels are on-going with draft guidance as recently as 9 July 2018 intended to assist applicants in writing drug labels: https://www.policymed.com/2018/07/fda-releases-draft-guidance-on-indications-and-usage-labeling-sections.html

14 https://www.nytimes.com/2006/01/19/us/new-drug-label-rule-is-intended-to-reduce-medical-errors.html

15 Worryingly, in a recent study, electronic health records commonly used in hospitals nationwide failed to detect up to one in three potentially harmful drug interactions and other medication errors (Classen et al., 2020).

Our paper also draws on studies in healthcare where scholars have explored the implementation of regulatory procedures on public health (e.g., Gruenspecht and Lave, 2006). In the case of the pharmaceutical industry, Dranove (2011) stresses the importance of quality certification for efficient and optimal regulation. For drugs, this certification comes in the form of the FDA approval process. This process can be divided into pre- and post-approval stages. The pre-approval stage includes clinical testing and provides the first line of defense to ensure safety and efficacy of products. This creates a tension, however, for regulators between length of trials and getting new drugs to market. For example, adverse events have been increasing (Moore et al., 2007) and have been associated with declines in pre-approval times (Olson, 2003). This makes post-approval safety monitoring critically important. In recent times FAERS has served as an important source of data for updating safety labeling information (Wysowski and Swartz, 2005).

Based on these prior studies, adverse safety related information should, on the margin, improve physician awareness about the potential safety of a drug and lead to changes in behavior. Presumably physicians (or consumers) may shift away from a drug given a safety concern. A number of studies focused on specific therapeutic markets support this association (e.g., Dranove and Olsen, 1994; Smalley et al., 2000; Cheah et al., 2007; Tekin and Markowitz, 2008; Bunniran et al., 2009; Dorsey et al., 2010; Kales et al., 2011; Dusetzina et al., 2012; Briesacher et al., 2013; Lu et al., 2014). Prior work also documents that this association could be differential; new drugs tend to be impacted more than existing drugs (Wilkinson et al., 2004) and geographic variation could cause the usage of a warned drug to be different (Shah et al., 2010).

What remains unknown from this batch of prior work is what constitutes a rational, medically appropriate response. One might expect physicians to switch some consumers to other drugs as the severity of relabeling events increases but are all of these changes medically appropriate or is there some other underlying motivation driving the switch? Are physicians being influenced by detailing as suggested by Macher and Wade (2016), anticipating future problems (i.e., "where there is smoke, there is fire"), practicing defensive medicine or responding to consumer concerns? Similarly, do consumers just seamlessly switch to different drugs or might they decide to stop treatment altogether and leave the market? Prospect theory (Tversky and Kahneman, 1992; Kahneman and Tversky, 1979) provides a behavioral explanation as to why physicians may switch consumers to other drugs (Verma et al., 2014). While a safety relabel is serious it is not necessarily relevant for all consumers, in all situations. However, physicians may overestimate the probability of a negative event occurring and incorrectly switch a consumer to another drug. Prospect theory can also help explain why consumers may ultimately choose to leave the market. When confronted with new information about a drug from their physicians, consumers may also vastly overestimate the probability of a negative event. As a result, they may incorrectly attribute these same negative effects to substitute drugs that physicians prescribe. If consumers make this link they may incorrectly conclude that the benefit of a new drug does not outweigh the risk and exit the market.

There is experimental evidence that supports these negative responses by consumers. For example, Bunniran et al. (2009) study trust and blame due to the withdrawal of pharmaceutical products as a result of safety related concerns. They found that consumers taking the withdrawn drug or those taking another drug within the same class were highly likely to blame pharmaceutical companies and the FDA. After an event, trust in both institutions remained fairly low. These declines provide one plausible explanation as to why consumers may formulate and attribute the negative effects described by Tversky and Kahneman (1992) and Kahneman and Tversky (1979) to an affected drug or a substitute. It also suggests that there are behavioral considerations at play that physicians (and regulators) need to consider when consumers get switched to a new drug due to safety-related concerns.

Regulation can also have unintended consequences from spillovers. For example, toy recalls due to safety reasons tend to cause negative industry-wide spillover effects for similar types of toys (Freedman et al., 2012). In our setting, such spillovers would manifest in the drugs within the same market or related market as the affected drug that is relabeled due to a safety concern. Krieger et al. (2018) support this notion and demonstrate that there are also innovation impacts to consider. They find a decline in the total number of drugs developed in an affected area, implying that these negative shocks may slow overall innovation in a given therapeutic category.

These results are concerning given the increase in drugs receiving some type of expedited review by the FDA. 16 On the one hand, it has been shown that drugs cleared via expedited review appear to offer greater quality-adjusted life years (QALYs) than those approved via conventional methods (Chambers et al., 2017) . It appears that the expedited review process has helped the FDA prioritize drugs that offer greater health gains (0.182 versus 0.003 QALYs). On the other hand, these approvals have come at a cost. The evidence appears to suggest that the drugs receiving some type of expedited review are more likely to receive some type of serious safety label change (e.g., Mostaghim et al., 2017; Moore and Furberg, 2014; Carpenter et al., 2008) .

We exploit FDA relabeling events to estimate a diff-in-diffs specification. As we discussed above, the relabeling process involves private interactions between the FDA and a firm and remains unknown to consumers and physicians prior to formal action. We use two groups of observations. The first group (treated) includes drugs sold in the U.S. Because FDA relabeling events only affect drugs sold in the U.S., our treated group is exposed to treatment in the post-relabel period but not in the pre-relabel period. The second group (control) is composed of the same exact drugs as those in the treated group but sold in the U.K. 17 As such, we estimate the following model:
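A standard diff-in-diffs form consistent with the variable definitions that follow is sketched here. The exact notation is an assumption: $X_{i,t}$ stands in for the controls, $\varepsilon_{i,t}$ is the error term, and the labeling of $\beta_0$ through $\beta_2$ is inferred from $\beta_3$ being the coefficient of interest on the interaction.

```latex
Y_{i,t} = \beta_0 + \beta_1\, Relabel_{i,t} + \beta_2\, US_i
        + \beta_3 \left( Relabel_{i,t} \times US_i \right)
        + \gamma' X_{i,t} + \mu_i + \delta_t + \varepsilon_{i,t}
```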

where $Y_{i,t}$ is demand (i.e., drug sales). $Relabel_{i,t}$ is a dummy variable for the post-treatment period represented by drug relabeling events and captures aggregate factors that would cause changes in $Y_{i,t}$ even in the absence of the treatment. $US_i$ is a dummy variable and captures possible differences between the treatment and control groups. We include a variety of controls, discussed below, as well as drug-level ($\mu_i$) fixed effects to control for time-invariant heterogeneity between drugs and year fixed effects ($\delta_t$) to control for common shocks impacting all drugs across time. This base specification is estimated at differing levels of aggregation so $\mu_i$ will also represent therapeutic market fixed effects. The coefficient of interest across all models is $\beta_3$ and it represents the impact induced by drug relabeling events on U.S. drugs relative to U.K. drugs. Our identification strategy relies on the fact that the control group is not exposed to treatment in either period. Importantly, the FDA does not have regulatory jurisdiction over drugs sold in the U.K. 18 This can be visually shown in Fig. 1 where the pre-trends do not appear to violate the parallel trend assumption. To test the parallel trend assumption more formally we take our pre-trend data and split it in half, defining the midpoint as an arbitrary treatment event and estimating our diff-in-diffs specification. If the parallel trend assumption is violated the coefficient $\beta_3$ will be statistically significant. The results for this placebo test are reported in Table A1. The coefficient of interest is not statistically significant across any model or level of analysis. Combined, the visual evidence along with these placebo test results suggest that the parallel trends assumption is not violated.

Table 1. Distribution of relabel activity between the U.S. and U.K. Our sample consists of drugs sold both in the U.S. and U.K. In order to create a clean control window we excluded drugs if they were relabeled in the U.K. within eight quarters of a U.S. relabel. This table shows the variation in relabeling types across the U.S. and U.K. for our sample. Within the imposed restrictions the average elapsed time between relabeling in the U.S. and U.K. is 12.95 quarters.

16 These issues were discussed in a recent JAMA Forum post: https://newsatjama.jama.com/2018/05/23/jama-forum-the-risks-and-benefits-of-expedited-drug-reviews/

17 There are two ways to approach a control group given our identification strategy. First, we could try to find a matched sample of other drugs in the U.S. that were not 'treated' or did not experience a relabeling event in the same time period. One problem with this approach is that 'matches' will always be done with some error as they would need to have significantly different mechanisms of action because of our intra- and inter-market substitution specifications. The second approach, and the one we took here, is the equivalent of the 'twin' studies in psychology and genetics (e.g., Polderman et al., 2015). By using the same exact drug, with the same mechanism of action, we remove that potential error or bias from our study. Theoretically, the only bias from our approach would be if there were differences between how patients in the U.S. and U.K. metabolized a drug; but no evidence exists that suggests that this is the case. The variable US will pick up differences between the treated and control groups.

18 The U.K. was chosen for reasons of common language and legal frameworks. La Porta et al. (2008) have shown that a country's laws are highly correlated with a broad range of its legal rules and regulations, as well as with economic outcomes. There also exists an extensive cross-cultural communications literature that suggests these issues are important. For example, there are negative effects of language, such as distortion, blockages and filtration, that have long been recognized in the literature (e.g., Bargiela-Chiappini and Nickerson, 2003). Additionally, language can have a 'foreignness' attached to it that can act as a barrier and includes coded terms that are common within and between groups (Hedlund, 1999; Nahapiet and Ghoshal, 1998). Ultimately, we are dealing with medical side-effects that are often reported in technical terms which will be common across the U.S. and U.K. By focusing on countries with a common language we are minimizing any bias that may be introduced from different terms, interpretations, or meanings of words.
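The diff-in-diffs comparison and the placebo test on the pre-period described in the empirical strategy can be sketched in a few lines. This is a simplified illustration, not the paper's estimation: it uses hypothetical toy data, omits controls and fixed effects, and returns only a point estimate, whereas the paper judges the placebo by statistical significance.

```python
from statistics import mean


def did(obs, event_quarter):
    """Two-by-two diff-in-diffs on (us, quarter, y) observations.

    us is 1 for the treated (U.S.) series and 0 for the control (U.K.) series;
    observations with quarter >= event_quarter count as post-treatment.
    Without controls or fixed effects this difference of mean changes equals
    the interaction coefficient in the two-group, two-period model.
    """
    def cell(us, post):
        return mean(y for u, q, y in obs if u == us and (q >= event_quarter) == post)

    return (cell(1, True) - cell(1, False)) - (cell(0, True) - cell(0, False))


def placebo(obs, event_quarter):
    """Re-run did() on pre-event data only, with a fake event at the midpoint."""
    pre = [o for o in obs if o[1] < event_quarter]
    quarters = sorted({q for _, q, _ in pre})
    fake_event = quarters[len(quarters) // 2]
    return did(pre, fake_event)


# Hypothetical toy data: log sales over quarters 0..7, relabel at quarter 4.
obs = [(1, q, 10.0 if q < 4 else 9.8) for q in range(8)]
obs += [(0, q, 8.0) for q in range(8)]

effect = did(obs, event_quarter=4)             # U.S. change relative to U.K.
pre_trend_gap = placebo(obs, event_quarter=4)  # 0.0 here: toy pre-trends are parallel
```

In this toy series the U.S. log sales fall by about 0.2 relative to the U.K. after the event, while the placebo estimate on the pre-period is zero.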

Our sample consists of all drugs sold in both the U.S. and U.K. during 2003 to 2009 as identified by IMS MIDAS™. Relabeling data for drugs sold in the U.S. was collected from the FDA MedWatch database and we restricted the data to those drugs that experienced a first-instance of a drug relabel. 19 Relabeling data for drugs sold in the U.K. was gathered from Datapharm's electronic Medicines Compendium that covers all drugs approved by the U.K. Medicines and Healthcare Products Regulatory Agency (MHRA). 20 In order to create a clean control group we further restricted our treated drugs to include only those that experienced a relabel in the U.S. but no relabel in the U.K. within eight quarters of the U.S. relabel. Table 1 provides the distribution of relabel activity in the U.S. and U.K. For those drugs that were subsequently relabeled in the U.K. the average time until relabel was 12.95 months after the relabel event in the U.S. This was shorter than 18.5 months documented by Pfistermeister et al. (2013) for a limited sample of psychiatric drugs. 21 Importantly, we could find no evidence that drug relabeling in the U.S., on average, systematically impacted contemporaneous physician prescription patterns in the U.K. (Fig. 1) . This further validates our U.K. sample as a clean control for causal estimates in our study.
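The clean-control restriction amounts to a filter on the gap between the U.S. and U.K. relabel dates. A minimal sketch, assuming quarters are stored as integer indices and that "within eight quarters" is symmetric and inclusive of the boundary (both are assumptions of this example, as are the drug names):

```python
def is_clean_control(us_relabel_quarter, uk_relabel_quarter, window=8):
    """Return True if a drug can stay in the sample.

    A drug is dropped when its U.K. relabel falls within `window` quarters of
    the U.S. relabel. Treating "within" as symmetric (before or after) and
    inclusive of the boundary is an assumption of this sketch.
    uk_relabel_quarter is None when the drug was never relabeled in the U.K.
    """
    if uk_relabel_quarter is None:
        return True
    return abs(uk_relabel_quarter - us_relabel_quarter) > window


# Hypothetical drugs: name -> (U.S. relabel quarter, U.K. relabel quarter).
drugs = {
    "drug_a": (10, None),  # never relabeled in the U.K.: kept
    "drug_b": (10, 14),    # U.K. relabel 4 quarters later: dropped
    "drug_c": (10, 23),    # U.K. relabel 13 quarters later: kept
}
sample = [name for name, (us_q, uk_q) in drugs.items() if is_clean_control(us_q, uk_q)]
print(sample)  # ['drug_a', 'drug_c']
```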

Next, we gathered quarterly drug-level sales, detailing (i.e., direct-to-physician promotions), and price data from IMS MIDAS™. Sales or quantity data is standardized by IMS into a 'standard unit' that equates pills, tablets and liquids. The data for both the U.S. and U.K. includes both hospital and retail channels. IMS MIDAS™ includes all branded and generic drugs and covers every therapeutic category. Detailing or direct-to-physician promotion data is available for all approved drugs. 22 Financial variables from the U.K. have been converted by IMS to U.S. dollars and all financial variables have been converted to real 2009 dollars using a GDP deflator. 23 Descriptive statistics are presented in Table 2.

Note that drugs are approved for use within 4-digit anatomical therapeutic chemical (ATC) markets. The ATC classification is controlled by the World Health Organization and was designed to categorize drugs into different groups according to the organ or systems that they treat. 24 There are four different levels of classification ranging from the most aggregate (1-digit ATC) to most disaggregate (4-digit ATC). For example, the 1-digit ATC market N comprises drugs for the nervous system. Within ATC N there are seven 2-digit ATC markets that contain 19 3-digit ATC markets. Each of these 3-digit ATC markets, in turn, contains 4-digit ATC markets. An advantage of our data is that it is available at the 4-digit ATC market level and can be aggregated as needed allowing us to capture intra- and inter-market substitution patterns. An example of the ATC structure across its multiple levels is mapped in Fig. 2.

Fig. 2. Mapping of ATC. This figure maps an example of the ATC therapeutic category N from the 1-digit (ATC 1) to 4-digit (ATC 4) level. Therapeutic category N represents the nervous system. This category has seven different 2-digit ATC categories. Focusing on the 2-digit category, N03 Antiepileptics, it contains only one 3-digit ATC category, N03A Antiepileptics, which itself contains eight 4-digit ATC categories. The 4-digit ATC category, N03AC Oxazolidine derivatives, includes three different drugs: (1) paramethadione, (2) trimethadione, and (3) ethadione. As an example, assume that trimethadione undergoes a drug relabel. Our primary specification analyzes the direct effect on trimethadione. Intra-market substitution considers the extent to which the other drugs within N03AC absorb sales from trimethadione. Inter-market substitution considers sales across all 4-digit ATC categories, N03AA to N03AX, within the same 3-digit ATC category, N03A. Drugs within the same 4-digit ATC category can be viewed as near perfect substitutes while drugs within the same 3-digit ATC category can be viewed as less perfect, but still medically viable, substitutes. Drugs across different 2-digit ATC therapeutic categories are not related for purposes of treatment.

19 It is possible to have multiple different types of relabeling activity at the same time. This is not a concern for our baseline models. However, when we examine the variation across types of relabeling activity we include those observations in each type of relabeling activity. We focus on four types of relabeling events: precaution, adverse reaction, warning and box warning. There was only one first-instance of a contraindication that met our sample criteria. It was excluded from the final sample; our results do not change with this exclusion.

20 https://www.medicines.org.uk/emc/

21 In Table A2 we extend the time frame for our baseline model from eight quarters to 12 and 16 quarters; our results remain robust to these longer time frames.

22 Detailing is legal in the U.K. during our sample period. The Blue Guide: Advertising and Promotion of Medicines in the UK (3rd Edition, Sept 2014) published by MHRA outlines the regulations and processes for promoting branded drugs in the U.K. Ultimately, the Association of the British Pharmaceutical Industry (ABPI) sets the Code of Practice (contained in The Blue Guide) for drug promotion in the U.K.

23 It is critical to note that the price data within IMS MIDAS™ is a wholesale price. It does not include adjustments as a result of back-end rebate payments or any other discounts that may be offered to insurance or prescription benefit companies.

24 For a more detailed discussion: https://www.whocc.no/atc_ddd_methodology/purpose_of_the_atc_ddd_system/

Our baseline dependent variable is drug sales (quantity) in standard units as determined by IMS. Sales are aggregated across varying dosages to the drug level since a relabeling event will impact the drug similarly across dosage types. We define Sales as the natural logarithm of quarterly drug sales plus one. In addition to the baseline drug level, we consider two additional aggregate models. First, we consider sales of all drugs within a drug's 4-digit ATC market. These drugs can be reasonably viewed as close substitutes. For example, both anti-viral drugs Invirase® and Norvir® are contained in the 4-digit ATC market J5AE (protease inhibitors). Importantly, this aggregation allows us to capture intra-market substitution by physicians.
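As a minimal sketch of how the drug-level dependent variable could be assembled, the snippet below sums standard units across dosage forms within a drug-quarter and then takes the natural logarithm of the total plus one. The function name, the tuple layout and the example rows are hypothetical and are not taken from IMS MIDAS™.

```python
import math
from collections import defaultdict


def drug_level_log_sales(rows):
    """Aggregate dosage-level standard units to the drug-quarter level.

    rows: iterable of (drug, quarter, dosage, standard_units) tuples
          (a hypothetical layout, assumed for illustration).
    Returns a dict mapping (drug, quarter) to ln(total standard units + 1).
    """
    totals = defaultdict(float)
    for drug, quarter, _dosage, units in rows:
        # Dosage forms of the same drug are pooled, as in the text.
        totals[(drug, quarter)] += units
    return {key: math.log(total + 1.0) for key, total in totals.items()}


# Made-up example rows, not actual sales data.
example_rows = [
    ("drug_a", "2005Q1", "10mg", 120.0),
    ("drug_a", "2005Q1", "20mg", 80.0),
    ("drug_b", "2005Q1", "5mg", 0.0),
]
log_sales = drug_level_log_sales(example_rows)
```

Adding one before taking logs keeps drug-quarters with zero sales in the sample, since ln(0 + 1) = 0.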

Second, we move up one more level of aggregation to the 3-digit ATC market. At this level of analysis we capture all drugs within multiple 4-digit ATC markets but contained within the same 3-digit ATC market. 25 For example, the two 4-digit ATC markets J5AE (protease inhibitors) and J5AF (nucleotide reverse transcriptase inhibitors) are contained within the 3-digit ATC market J5A (direct acting antivirals). As a second example, the two 4-digit ATC markets N3AF (carboxamide derivatives) and N3AG (fatty acid derivatives) are contained within the 3-digit ATC market N3A (anti-epileptics). This level of aggregation allows us to capture inter-market substitution by physicians. 26

As indicated above, our sample includes drugs that were sold both in the U.S. and U.K. We define U.S. as a dummy variable that equals one if the drug was sold in the U.S., zero otherwise. In order to implement our diff-in-diffs strategy, we define a dummy variable (Relabel) that equals one for all observations after a drug's first relabeling event, zero otherwise. Relabel encompasses four types of events: precaution, adverse reaction, warning and box warning.
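The indicator variables described above can be sketched as follows. This is only an illustration: the integer quarter encoding and the function name are assumptions, and the sketch assumes that Relabel switches on for both the U.S. and the U.K. observations of a drug after the U.S. relabel date, so that the interaction isolates the U.S. post-relabel observations.

```python
def did_indicators(country, quarter_index, first_relabel_index):
    """Return the (U.S., Relabel, Relabel * U.S.) dummies for one observation.

    country: "US" or "UK" (hypothetical encoding).
    quarter_index: integer counter for the observation's quarter (assumed).
    first_relabel_index: integer counter for the quarter of the drug's
        first U.S. relabeling event (assumed).
    """
    us = 1 if country == "US" else 0
    # Relabel equals one only for quarters strictly after the first relabel.
    relabel = 1 if quarter_index > first_relabel_index else 0
    return us, relabel, us * relabel
```

For example, `did_indicators("UK", 5, 4)` describes a control observation in the post-relabel period, for which the interaction term is zero.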

Prior work has demonstrated the importance of detailing on physician prescription behavior (e.g., Datta and Dave, 2017; Manchanda and Honka, 2005) and reducing price elasticity (Windmeijer et al., 2005; Rizzo, 1999). However, contemporaneous detailing is a function of current sales, which can create a reverse causal relationship. To resolve this issue we use lagged promotion stock as studies have shown that promotions have a carry-over effect (e.g., Zhao et al., 2013). Importantly, prior promotion expenditures should not be impacted by contemporaneous sales. As such we define Lagged promotion stock as the discounted sum of the prior three quarters' detailing expenditures. We follow the literature (Leone, 1995) and use a 70% discount rate; however, our baseline results are not sensitive to its inclusion or to variation in the discount rate. 27 Focusing on black-box warnings in the Type-2 diabetes market, Macher and Wade (2016) found that affected firms took strategic actions with respect to promotions to mitigate losses from the relabeling event. Lagged promotion stock in the drug-level models will capture these effects. They also found that competitors take advantage of these adverse events by increasing promotion activity in order to try to steal market share. Lagged promotion stock in the aggregated models at the 4-digit and 3-digit ATC market-level will control for these competitive dynamics. These latter two models will also capture and control for any affected firm promotion response.
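One way the lagged promotion stock could be computed is sketched below. The text does not spell out the weighting scheme, so the sketch assumes that expenditures lagged k quarters receive weight (1 − discount rate)^k; this weighting, the function name and the example figures are all assumptions made for illustration.

```python
def lagged_promotion_stock(detailing, t, discount_rate=0.70, lags=3):
    """Discounted sum of detailing spend over the `lags` quarters before t.

    detailing: list of quarterly detailing expenditures, oldest first.
    t: index of the current quarter; spending in quarter t is excluded.
    discount_rate: assumed to shrink each further lag by (1 - discount_rate).
    """
    retention = 1.0 - discount_rate
    stock = 0.0
    for k in range(1, lags + 1):
        if t - k >= 0:  # skip lags that fall before the start of the series
            stock += (retention ** k) * detailing[t - k]
    return stock


# Made-up expenditures for four consecutive quarters.
example_detailing = [100.0, 200.0, 300.0, 400.0]
stock_q4 = lagged_promotion_stock(example_detailing, t=3)
```

Because only prior quarters enter the sum, the current quarter's detailing (400.0 in the example) does not affect the stock, which is the property the identification argument relies on.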

Next, we control for several drug and market characteristics that may influence sales or demand. First, we define Vintage as a measure of elapsed time, in quarters, from introduction to the time of relabel. Drugs that have been on the market longer have time to build up brand loyalties with consumers and physicians even though they may become 'outdated' as newer treatments come to market. Finally, we include count variables for the Number of brands and Number of generics. The former controls for the intra-market substitution possibilities. The latter controls for cross-molecular substitution or the insurance company's ability to attempt to influence physicians to switch patients to a generic of another branded drug within the same therapeutic market (Branstetter et al., 2014, 2016; Castanheira et al., 2019).

reasonable substitution patterns, on average, we aggregate markets up one more level to the 2-digit ATC market. At this level of aggregation we capture all 3-digit ATC markets contained within a 2-digit ATC market. Each of those 3-digit ATC markets will include 4-digit ATC markets. For example, let's consider the 2-digit ATC market J04 (antimycobacterials). It contains two 3-digit ATC markets, J04A (drugs for treatment of tuberculosis) and J04B (drugs for treatment of lepra). The 3-digit ATC market J04A contains six 4-digit ATC markets: J04AA (aminosalicylic acid and derivatives), J04AB (antibiotics), J04AC (hydrazides), J04AD (thiocarbamide derivatives), J04AK (other drugs for the treatment of tuberculosis), and J04AM (combinations of drugs for the treatment of tuberculosis). The 3-digit ATC market J04B contains one 4-digit ATC market, J04BA (drugs for the treatment of lepra). Like our 3-digit ATC market level of analysis this 2-digit ATC market level of analysis can also be viewed as capturing inter-market substitution.

27 Following Leone (1995) we vary the discount rate between 50 and 70%.

As indicated above, for those drugs that have multiple dosages sold by the same firm we aggregate the data together to the drug-level. We define Price by dividing drug-level revenues by the quantity of drugs sold. It is important to note that we are capturing wholesale price and this does not include any unmeasured discounting (rebates) by pharmaceutical companies, which is not currently commercially available. This price variable, however, will be highly correlated with ultimate consumer price and as such will be endogenous. 28 To address this concern we follow the literature (e.g., Nevo, 2001) and use the mean and median price of other drugs in closely related markets as instruments for the drug's price. Specifically, we use the mean and median price of other drugs within the same 2-digit ATC market. For example, if our affected drug is a MAO-inhibitor (4-digit ATC market C02KC) we take the mean and median price of drugs in the broader 2-digit ATC market, C02 (anti-hypertensives). Drugs within the same 2-digit ATC should, on average, be correlated due to similar marginal costs but uncorrelated with the affected drug's unobserved product characteristics. The instruments pass the usual tests and are reported in the bottom panel of each table.
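The construction of the price instruments can be illustrated with a short sketch: for a focal drug, take the mean and median price of the other drugs in the same 2-digit ATC market, leaving the focal drug itself out. The function name, the dictionary layout and the example prices are hypothetical.

```python
from statistics import mean, median


def price_instruments(prices_by_drug, atc2_by_drug, focal_drug):
    """Mean and median price of other drugs in the focal drug's ATC2 market.

    prices_by_drug: dict mapping drug name to price (assumed layout).
    atc2_by_drug: dict mapping drug name to its 2-digit ATC code.
    Returns (mean_price, median_price), or None if the focal drug has no
    other drugs in its 2-digit ATC market.
    """
    market = atc2_by_drug[focal_drug]
    others = [
        price
        for drug, price in prices_by_drug.items()
        if drug != focal_drug and atc2_by_drug[drug] == market
    ]
    if not others:
        return None
    return mean(others), median(others)


# Made-up prices; "C02" and "N03" are 2-digit ATC codes.
example_prices = {"a": 2.0, "b": 4.0, "c": 9.0, "d": 1.0}
example_atc2 = {"a": "C02", "b": "C02", "c": "C02", "d": "N03"}
instruments_a = price_instruments(example_prices, example_atc2, "a")
```

Excluding the focal drug's own price is what keeps the instruments from mechanically reflecting that drug's unobserved characteristics.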

In Table 3 we present empirical results from Eq. (1). Model 1 presents estimates at the drug level, Model 2 presents estimates at the 4-digit ATC market level and Model 3 presents estimates at the 3-digit ATC market level. Model 1 can be viewed as testing the causal impact of drug relabeling on aggregate drug demand while Model 2 captures intra-market drug substitution. In other words, Model 2 helps us understand if physicians switch consumers to another drug in the same 4-digit ATC market. An example of such a substitution would be a switch from the anti-viral Invirase® to Norvir®. Finally, Model 3 captures inter-market drug substitution. In this case, physicians switch patients to another drug in a different 4-digit ATC market but within the same 3-digit ATC market. In the prior example, both Invirase® and Norvir® are in the 4-digit ATC market J5AE (protease inhibitors). In the current example, a physician would be switching a patient from either of those two drugs to Retrovir®, which is in the 4-digit ATC market J5AF (nucleotide reverse transcriptase inhibitors). All three drugs are treatments for HIV and both 4-digit ATC markets, J5AE and J5AF, are contained within the 3-digit ATC market J5A (direct acting antivirals).

The dependent variable across all three models is Sales, and each model includes our full set of controls. In Model 1 we include drug and time fixed effects while in Models 2 and 3 we include market and time fixed effects. Price is instrumented in all models and passes the usual test statistics, which are reported at the bottom of the table. Standard errors are clustered at the 2-digit ATC market level. 29 The coefficient of interest is the interaction term (Relabel * U.S.); it is negative and statistically significant across all models. In Model 1 we find a 16.9% decline in aggregate drug sales caused by the first instance of a drug relabel. 30 When we aggregate within 4-digit ATC markets in Model 2 we find a 5.1% decline in aggregate sales. Importantly, this model accounts for demand of the affected drug that was absorbed by other drugs within that same 4-digit ATC market. In other words, physicians engaged in intra-market substitution and switched patients to another drug within the same therapeutic market. From the previous example, this would be a switch from Invirase® to Norvir® within the 4-digit ATC market J5AE. 31 This is not the only substitution that can take place. It is possible that physicians can engage in inter-market substitution and switch consumers to another drug in a different 4-digit ATC market but still within the same 3-digit ATC market. Again, in the above example, this would be a switch from Invirase® (4-digit ATC market J5AE) to Retrovir® (4-digit ATC market J5AF) which are both in 3-digit ATC market J5A. In Model 3 we find a 4.7% decline in sales for drugs within a 3-digit ATC market that experienced a relabel. Critically, the result in Model 3 implies that after controlling for affected firm and competitor actions and capturing intra- and inter-market substitution patterns aggregate demand still declined by 4.7%.

Table 4 Effects of relabeling on demand: Low-intensity markets. Dependent variable is the natural logarithm of sales, ln(Sales). Low-intensity markets are defined as those 4-digit ATC markets where there was only one relabeling event over our sample period. The unit of analysis in Model 1 is the drug level, Model 2 is the 4-digit ATC market (ATC4) level and Model 3 is the 3-digit ATC market (ATC3) level. Price is instrumented in all models with relevant tests reported in the table. Controls include Vintage, Number of brands, and Number of generics. The models are log-linear, as such the marginal effects are calculated using the equation exp(β) − 1 where β is the respective coefficient on our variable of interest, (Relabel*U.S.). Marginal effects for our variable of interest are reported in the lower panel. Standard errors are clustered at the 2-digit ATC market level. Constants are included in all specifications but omitted from the table. * p < 0.10, ** p < 0.05, *** p < 0.01.

(Table 3). Again, results remain consistent.

31 In Tables A2 and A3 we test alternative treatment periods. First, in Table A2 we consider time periods of three (Model 2) and four years (Model 3) before and after a drug relabeling. Our base model (Model 1, Table 3) is included as Model 1 for comparative purposes. Second, in Table A3 we widen the treatment window around the actual drug relabel. As a reminder, our baseline model excludes the quarter when a relabeling event occurred. In Model 1 and Model 2 we increase that exclusion to one and two quarters, respectively, before the quarter of relabel. This increase in exclusion will help if information leaks prior to announcement. All of the robustness results are consistent with our main findings in Table 3.

It is important to recall the process that is involved with these types of substitutions. Only a physician can switch a consumer to another drug. While we can detect ex post that a substitution has occurred, we do not know what precipitated the move. 32 There are several possibilities. First, consumers could become informed of the relabel and push a physician to switch them. Second, physicians could independently learn about the relabel and decide to proactively switch a consumer either for medically related reasons or for defensive medicine concerns. Third, physicians could learn about the relabel through detailing, either by the affected company or by a competitor and then decide to switch a consumer to another drug. These explanations are not mutually exclusive and there is recent evidence to support the role of detailing (Macher and Wade, 2016). 33 Given that our data is at the standard unit level we do not know exactly how many consumers this represents because prescription patterns will differ across drugs and consumers. We can, however, calculate a conservative, lower bound if we assume that the loss was for chronic conditions that require daily uptake. Under this assumption, we can multiply the decline in aggregate demand from Model 3 by average sales over the two-year sample period prior to the relabeling event. This translates into an estimated decline of 7.97 million standard units or slightly over 265,000 30-day prescriptions. If all of these prescriptions were for chronic conditions then this translates into a loss of approximately 11,000 consumers. 34 Again, this is likely to be a conservative, lower bound estimate because not every prescription is for a chronic condition requiring a daily dose. As the number of prescriptions for acute conditions increases, so would the resulting loss.
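The lower-bound arithmetic above can be retraced with a few lines of code. The 7.97 million standard units figure comes from the text; the conversion factors (one standard unit per day, 30-day prescriptions, and 24 monthly prescriptions per chronic patient over the two-year window) are assumptions chosen because they reproduce the reported magnitudes, and the function name is hypothetical.

```python
def lower_bound_consumers(lost_standard_units, units_per_day=1.0,
                          days_per_prescription=30, months_in_window=24):
    """Convert lost standard units into prescriptions and chronic patients.

    Assumes daily uptake of `units_per_day` standard units and one 30-day
    prescription per month for every month in the two-year window.
    """
    prescriptions = lost_standard_units / (units_per_day * days_per_prescription)
    consumers = prescriptions / months_in_window
    return prescriptions, consumers


lost_prescriptions, lost_consumers = lower_bound_consumers(7.97e6)
```

Under these assumptions the prescription count lands slightly above 265,000 and the consumer count close to 11,000, in line with the figures quoted in the text.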

Relabeling intensity varies across therapeutic markets (see Table A4). In Tables 4 and 5 we explore how these differential intensities impact aggregate demand. We divide our data into two sub-samples and define 'low-intensity markets' and 'high-intensity markets'. 35 In Table 4, low-intensity markets are defined as those 4-digit ATC markets where there was only one relabeling event over our sample period. In contrast, in Table 5, we define high-intensity markets as those 4-digit ATC markets where more than one relabeling event occurred over the sample period. In Table 4, Model 1 the coefficient on the interaction term (Relabel * U.S.) is negative and statistically significant at the one percent level. We find a decline of 10.8% in aggregate demand for drugs in these low-intensity markets. Interestingly, however, in Model 2 and Model 3 the interaction is not statistically significant. This suggests that intra-market substitution absorbed the decline in aggregate drug demand. In other words, in these markets physicians were successfully able to switch consumers to another drug within that same 4-digit ATC market. To the extent that consumer or physician concerns are warranted due to a relabeling event, this is the expected outcome.

Table 5 Effects of relabeling on demand: High-intensity markets. Dependent variable is the natural logarithm of sales, ln(Sales). High-intensity markets are defined as those 4-digit ATC markets where there was more than one relabeling event over our sample period. The unit of analysis in Model 1 is the drug level, Model 2 is the 4-digit ATC market (ATC4) level and Model 3 is the 3-digit ATC market (ATC3) level. Price is instrumented in all models with relevant tests reported in the table. Controls include Vintage, Number of brands, and Number of generics. The models are log-linear, as such the marginal effects are calculated using the equation exp(β) − 1 where β is the respective coefficient on our variable of interest, (Relabel*U.S.). Marginal effects for our variable of interest are reported in the lower panel. Standard errors are clustered at the 2-digit ATC market level. Constants are included in all specifications but omitted from the table. * p < 0.10, ** p < 0.05, *** p < 0.01.

In high-intensity markets, on the other hand, results are more complex. Across all models in Table 5 the interaction term is negative and statistically significant. In Model 1 aggregate drug demand declined by 18.9% while in Model 2 aggregate demand declined by 6.0% for drugs within a drug's 4-digit ATC market. As before, Model 2 represents intra-market substitution or consumers being switched to other drugs within the same 4-digit ATC market. Shifting to the 3-digit ATC market that incorporates inter-market substitution patterns, Model 3, aggregate demand declined by 5.0%.
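The percentage declines reported in this section are marginal effects from log-linear models. A minimal sketch of the conversion is given below, using the standard result that a dummy coefficient β in a log-linear model implies a proportional change of exp(β) − 1. The function names are hypothetical, and the inverse mapping is included only to show what size of coefficient a reported decline implies; it does not reproduce the estimated coefficients in the tables.

```python
import math


def marginal_effect(beta):
    """Proportional change in sales implied by a log-linear dummy coefficient."""
    return math.exp(beta) - 1.0


def implied_coefficient(pct_change):
    """Coefficient that would generate a given proportional change."""
    return math.log(1.0 + pct_change)


# A reported decline of 18.9% corresponds to a proportional change of -0.189.
beta_high_intensity = implied_coefficient(-0.189)
```

For small coefficients the two quantities are close, but the gap widens as declines grow, which is why the marginal effect rather than the raw coefficient is read as the percentage change.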

In Tables A5 and A6 we redefine low-intensity and high-intensity markets as those markets in the bottom and top quartile of relabeling activity. 36 Results remain consistent with those reported in Tables 4 and 5. In low-intensity markets, Table A5, Model 1 aggregate demand declined by 10.3%. The interaction was not significant in Model 2 or Model 3, again suggesting that intra-market substitution absorbed the entire decline. For the high-intensity markets, Table A6, Model 1 aggregate drug demand declined by 20.1%. In Model 2, which incorporates intra-market substitution patterns, aggregate demand declined by 13.0%. Finally, in Model 3 that incorporates inter-market substitution, aggregate demand declined by 8.3%; markets with repeated negative shocks appear to reinforce consumers' behavioral responses.

As discussed in Section 2 the severity of drug relabeling spans from precaution (least serious) through box warnings (most serious). Table 6 explores whether the aggregate demand response we document varies across this continuum of severity. We split the data into three subsamples representing precaution (Model 1), adverse reaction (Model 2) and warning/box warning (Model 3). The categorization continues to be based on the first time a drug is relabeled and allows us to isolate out the effects of any potential prior relabeling activity. Drugs that have multiple types of relabeling are counted individually in each category. 37 Across all models the interaction remains negative and statistically significant. As expected, we see an increasingly negative aggregate demand response as severity increases; aggregate demand declines by 15.6%, 20.3% and 36.3% in Models 1, 2 and 3, respectively.

The increasing decline in aggregate demand as severity increases should not be surprising; physicians appear to be switching consumers to other drugs as new potential risks reveal themselves. Notwithstanding this general decline, the magnitude of the results in Model 1 is unexpected. This appears to be a rather strong aggregate demand response given the limited severity of the relabeling event. Unfortunately, we do not know what caused physicians to react in such a significant way. That said, if the response is medically warranted or if physicians believe there may be future problems with a relabeled drug, then we should see intra-market substitution absorb this decline. 38 We examine this in Table 7 where we split the sample and combine the two least severe relabeling events (i.e., precaution and adverse reaction) together. Again, across the models we find a negative and statistically significant coefficient on our interaction of interest. At the drug level, Model 1, aggregate demand declined by 14.7% while at the 4-digit ATC market level, which incorporates intra-market substitution, aggregate demand declined by 5.1%. At the 3-digit ATC market level, Model 3, which accounts for inter-market substitution, aggregate demand still declined by 4.0%.

Table 7 Effects of precaution/adverse reaction relabeling on demand. Dependent variable is the natural logarithm of sales, ln(Sales). Sample includes the combination of precaution and adverse reaction. The unit of analysis in Model 1 is the drug level, Model 2 is the 4-digit ATC market (ATC4) level and Model 3 is the 3-digit ATC (ATC3) level. Price is instrumented in all models with relevant tests reported in the table. Controls include Vintage, Number of brands, and Number of generics. The models are log-linear, as such the marginal effects are calculated using the equation exp(β) − 1 where β is the respective coefficient on our variable of interest, (Relabel*U.S.). Marginal effects for our variable of interest are reported in the lower panel. Standard errors are clustered at the 2-digit ATC market level. Constants are included in all specifications but omitted from the table. * p < 0.10, ** p < 0.05, *** p < 0.01.

In Table 6 Model 3, aggregate demand declined by 36.3% for drugs that received either a warning or box warning. This response should not be surprising given the severity of the relabeling event. In Table 8, we combine warnings and box warnings and examine their intra- and inter-market substitution patterns. Across all three models in Table 8 our coefficient on the interaction term is negative and statistically significant. At the 4-digit ATC market level that incorporates intra-market substitution patterns (Model 2), aggregate demand declined by 10.0%. At the 3-digit ATC market level that accounts for inter-market substitution patterns (Model 3), aggregate demand declined by 8.3%. As the severity of the relabeling event increases (Table 7, Model 3 versus Table 8, Model 3) the aggregate demand response increases as well. 39 Importantly, given the substitution patterns captured within Model 3, consistent with prospect theory, consumers appear to be viewing potential substitutes in the same negative manner as the affected drug.

Finally, we combine the intensity levels of relabeling activity from the prior section and examine how it impacts the heterogeneity of relabeling severity that we considered in this section. In Tables A9 and A10 we replicate Tables 7 and 8 for low-intensity markets. Results are consistent with our prior findings (Table 4 and Table A5). In Tables A9 and A10 we see declines in aggregate demand (Model 1) of 6.6% and 45.0%, respectively. Results in Models 2 and 3 are not statistically significant, suggesting that the entire decline in aggregate drug demand was absorbed by intra-market substitution.

In Tables A11 and A12 we replicate Tables 7 and 8 for high-intensity markets. Again, results are consistent with our prior findings for high-intensity markets (Table 5 and Table A6). For relabeling events that involved precautions or adverse reactions in high-intensity markets, aggregate demand declined by 17.3% (Table A11, Model 1). At the 4-digit ATC market level (Model 2) that incorporates intra-market substitution patterns, aggregate demand declined by 5.9%. Finally, at the 3-digit ATC market level (Model 3) that incorporates inter-market substitution patterns, aggregate demand declined by 4.8%. The most significant declines are in high-intensity markets with warnings or box warnings (Table A12). Aggregate demand declined by 34.3% at the drug level (Model 1), 10.4% at the 4-digit ATC market level (Model 2), and 15.8% at the 3-digit ATC market level (Model 3). Unlike low-intensity markets where intra-market substitution absorbed the decline in aggregate drug demand, in high-intensity markets we see significant declines in aggregate demand.

Prior research has demonstrated that positive demand shocks generate more R&D (Blume-Kohout and Sood, 2013; Dranove et al., 2014; Manso et al., 2019). To the extent that a negative shock decreases market size, we would expect to see a decline in R&D (Acemoglu and Linn, 2004), which is consistent with Krieger et al. (2018). In an effort to estimate the economic losses from these regulatory shocks on firm performance we conduct an event study. The advantage of the event study methodology in this instance is that it will capture the loss in future discounted cash flows from two sources: (1) the unexpected losses in revenues from the relabeled drug over its remaining lifecycle; and, (2) the unexpected losses from declines in future innovation.

We follow McWilliams and Siegel (1997) to compute cumulative abnormal returns (CAR). First, we estimate the market model using OLS over a period of 250 days prior to the event. The estimation equation is the following:

$$R_{it} = \alpha_i + \beta_i R_{mt} + \varepsilon_{it} \qquad (2)$$

where R_it is the return for firm i at time t and R_mt is the market return. The estimated OLS parameters represent the stock's "normal" return with respect to the market in a period prior to the event. The abnormal return (AR) is defined as the return during a time span that includes the relabeling event minus the estimated return accounting only for the market effect. In other words, the abnormal return is the forecast error between the "actual" and the "normal" rate of returns. Empirically it is estimated as:

$$AR_{it} = R_{it} - (\hat{\alpha}_i + \hat{\beta}_i R_{mt}) \qquad (3)$$

After estimating the abnormal returns for each firm i at time t, CAR is computed as the cumulative value of the standardized abnormal returns or:

$$CAR_i = \frac{1}{\sqrt{k}} \sum_{t=1}^{k} \frac{AR_{it}}{SD_{it}} \qquad (4)$$

where AR_it is defined by Eq. (3), SD_it is the abnormal return standard deviation and k represents the event window. We consider two different standard event windows, (−1, +1) and (−3, +1). The event date is defined as t = 0 and it represents the date of the public announcement of the relabeling event. Thus, the first event window considers the day of the event plus one day on either side of the event. The second event window considers the day of the event plus one day after and three days prior to the event. Finally, we multiply CARs by firm market capitalization data obtained from COMPUSTAT. The monetized value of CAR represents the unexpected change in the stream of future discounted cash flows from the two sources identified above. We find CARs of −0.49% and −0.76%, significant at the 1% level, across the two event windows, (−1, +1) and (−3, +1), respectively. Multiplying these CARs by firm market capitalization data translates into average losses of between $569 million and $882 million. A back-of-the-envelope calculation allows us to parse the losses into their two sources. From Model 1, Table 3 we know that drug demand falls, on average, by 16.9%. Multiplying this by average quarterly sales (21.2 million SU × 16.9%) and dividing by three equals monthly losses of 1.2 million SU. Next, we multiply this loss in demand by median price ($2.36) and by the average effective remaining patent life (66 months) for a total loss of $186 million. 40 The remaining difference of between $383 million and $696 million can be attributed to the unexpected losses from declines in future innovation. 41 Importantly, this result, combined with Krieger et al. (2018), provides evidence that the impact on upstream innovation from downstream regulatory shocks is significant.

38 The average probability that a drug that has received a precaution receives another relabel is 72.2%. As such, physicians may be pre-emptively switching patients to another drug. However, in this case we should see the entirety of aggregate demand decline of a drug absorbed by intra-market substitution.

39 In Tables A7a, A7b, A8a and A8b we consider alternative time periods. First, in Tables A7a and A7b we consider three and four years before and after a relabeling event (as opposed to two years in our baseline model). Second, our baseline model excludes the quarter in which a relabeling event occurred. In Tables A8a and A8b we exclude one and two quarters prior to the relabeling event (along with the quarter of the event). In both tables and across all models our results remain consistent with our baseline findings.

40 Grabowski and Vernon (2000) report an average effective patent life for branded drugs of 11.5 years. The mean/median of Vintage is 24 quarters or 6 years resulting in an average remaining effective patent life of 5.5 years or 66 months.
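The back-of-the-envelope decomposition can be retraced step by step. The sketch below uses only figures quoted in the text (average quarterly sales of 21.2 million standard units, a 16.9% decline, a median price of $2.36, 66 months of remaining effective patent life, and event-study losses of $569 million and $882 million); the function name and argument names are hypothetical.

```python
def decompose_losses(total_loss_musd, quarterly_sales_msu=21.2, decline=0.169,
                     median_price_usd=2.36, remaining_months=66):
    """Split an event-study loss into product and innovation components.

    All monetary values are in millions of U.S. dollars and sales are in
    millions of standard units (SU).
    """
    # Lost quarterly demand converted to a monthly figure.
    monthly_lost_msu = quarterly_sales_msu * decline / 3.0
    # Revenue lost on the relabeled drug over its remaining patent life.
    product_loss = monthly_lost_msu * median_price_usd * remaining_months
    # The residual is attributed to declines in future innovation.
    innovation_loss = total_loss_musd - product_loss
    return product_loss, innovation_loss


product_loss, innovation_low = decompose_losses(569.0)
_, innovation_high = decompose_losses(882.0)
```

With these inputs the product-level loss comes out at roughly $186 million, leaving residuals of roughly $383 million and $696 million for the two event windows.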

It may be possible that variation in market size or the level of competition within markets may differentially influence physician prescribing behavior or consumer behavior. For example, business or general news stories may enhance physician or consumer awareness about a drug. We examine these issues in Table A13 . In Models 1 and 2 we separate markets into the bottom and top quartiles of sales while in Models 3 and 4 we create a Herfindahl-Hirschman Index (HHI) and separate markets into the bottom and top quartiles, respectively. Across all models we find a negative and significant coefficient on our interaction term. Aggregate demand declined by 9.5% and 19.8% in the bottom and top sales quartiles (Models 1 and 2), respectively. However, when we consider the bottom and top quartiles of HHI, the difference becomes negligible. In Models 3 and 4, aggregate demand declined by 22.8% and 21.3%, respectively. Thus, we see some variation in response across market sizes but not across levels of competition.
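For readers unfamiliar with the competition measure, a Herfindahl-Hirschman Index is the sum of squared market shares. The sketch below computes it from sales on the conventional 0 to 10,000 scale; whether shares in the paper are measured at the firm or the drug level, and on which scale, is not stated in the text, so the firm-level input and the function name are assumptions.

```python
def hhi(sales_by_seller):
    """Herfindahl-Hirschman Index on a 0 to 10,000 scale.

    sales_by_seller: dict mapping a seller (firm or drug, assumed) to its
        sales within one market.
    """
    total = sum(sales_by_seller.values())
    if total <= 0:
        raise ValueError("total market sales must be positive")
    # Shares are expressed in percentage points before squaring.
    return sum((100.0 * sales / total) ** 2 for sales in sales_by_seller.values())


# Made-up markets: a monopoly and an evenly split duopoly.
monopoly_hhi = hhi({"firm_a": 10.0})
duopoly_hhi = hhi({"firm_a": 50.0, "firm_b": 50.0})
```

Higher values indicate more concentrated markets, so the top HHI quartile corresponds to the least competitive markets in the sample.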

A benefit of the breadth of our data is that we capture all therapeutic markets; the impacts we find are average effects across these markets. Lost in our analysis, however, is the potential heterogeneity that may exist between markets. Thus, we examine two therapeutic markets that, according to our discussions with physicians and prior research, exhibit significantly different adherence rates and treatment periods. The first market we consider is ATC N (nervous system), which comprises seven 2-digit ATC therapeutic markets: anesthetics (N01), analgesics (N02), antiepileptics (N03), anti-Parkinson (N04), psycholeptics (N05), psychoanaleptics (N06) and other nervous system drugs (N07).

Within these 2-digit ATC markets we have additional 3-digit and 4-digit ATC markets. For example, within N06 reside anti-depressants (N06A) and anti-dementia (N06D) drugs. In general, ATC N exhibits lower levels of non-adherence and longer treatment periods than our second therapeutic market. One study places the non-adherence rates of antiepileptic drugs at 26% (Faught et al., 2008). In Table A14 we find a decline in aggregate drug demand of 21.4% (Model 1); however, the coefficient of interest is not significant in Model 2 or Model 3. These markets experience greater declines in aggregate demand, in percentage terms, than we saw for the overall sample, but the entire decline is absorbed by intra-market substitution. That is, physicians successfully switch consumers to other drugs within the same 4-digit ATC market.

The second market that we consider is ATC J (anti-infectives), which comprises six 2-digit ATC markets: anti-bacterials (J01), antimycotics (J02), anti-mycobacterials (J04), anti-virals (J05), immune sera and immunoglobulins (J06), and vaccines (J07). The 2-digit ATC market J01 includes 10 different 3-digit ATC markets comprising various classes of anti-bacterials; for example, tetracyclines (J01A) and beta-lactam anti-bacterials/penicillins (J01C). In general, these ATC markets exhibit greater rates of non-adherence and shorter treatment periods than ATC N. Two recent studies (Fernandes et al., 2014; Tong et al., 2018) place the non-adherence rates for antimicrobial therapies at greater than 57%. In Table A15 we find a decline in aggregate demand of 24.2% (Model 1). In these markets, however, we also see declines of 13.8% and 13.5% in the 4-digit (Model 2) and 3-digit (Model 3) ATC markets, respectively.

While we explore only two markets, we see rather significant heterogeneity in physician substitution patterns and consumer response. These two markets were intentionally chosen because they differ in non-adherence rates and average treatment lengths. Unfortunately, we lack the data to say with certainty what specific attribute of these markets caused the physician and consumer responses that we observed. What we can say, however, is that there appears to be significant heterogeneity across markets and that this has implications for firm and competitor responses as well as for regulators. Further work exploring the reasons behind these movements is clearly warranted.

Regulatory interventions rarely occur without consequences, many of them unintended. Understanding how they impact markets is important for both firms and policymakers, especially in markets that are R&D intensive, like pharmaceuticals. While we are not the first to analyze the impacts of drug relabeling in the U.S., we are the first to do so in such a comprehensive and causal manner. Given the breadth of our data we are able to incorporate all plausible intra- and inter-market substitution patterns along with affected firm and competitor actions. This allows us to estimate not only the causal impact of a relabeling event on a drug but also quantify the overall effects on aggregate demand. In our baseline regressions (Table 3, Model 1) we find a decline in aggregate drug demand of 16.9%. Our back-of-the-envelope calculation suggests that this decline translates into an average loss of $186 million over the drug's remaining effective patent life. This is not the only loss that the affected firm suffers. In addition to losses in current and future sales from the affected drug, there can be unexpected losses from declines in future innovation (Krieger et al., 2018). In order to calculate the economic losses from these combined effects we utilize an event study. Results across two standard event windows translate into average losses of between $569 million and $882 million. Backing out the $186 million average loss on the affected product suggests that the market is anticipating unexpected losses attributable to declines in future innovation in the range of $383 million to $696 million. Combined with Krieger et al. (2018), our results suggest that these regulatory shocks cause enough damage to downstream aggregate demand that upstream innovation is impacted.

When we take a step back and consider intra-market substitution patterns, or the shifting of consumers to another drug within the same 4-digit ATC market, we find an aggregate demand decline of 5.1% (Table 3, Model 2). In a different setting, Macher and Wade (2016) find that competitors take advantage of drugs when they are hit with black-box warnings. Our findings are broadly consistent with Macher and Wade (2016), as this specification controls for the promotion activity of both the affected firm and competitor firms. However, the extent to which competitor firms 'pull' consumers via promotion activity, or consumers are 'pushed' by physicians due to behavioral explanations, is undetermined.

If we take yet another step back and consider both intra- and inter-market substitution patterns, or the shift of consumers to another drug in a different 4-digit ATC market but within the same 3-digit ATC market, we still find a decline in aggregate demand of 4.7% (Table 3, Model 3). This result suggests that not all consumers are absorbed by competitors after accounting for all plausible substitution patterns; some consumers prematurely leave the market. Thus, across our baseline results these regulatory shocks have implications for affected firms, intra-market competitors, inter-market competitors and welfare, which we discuss below. Importantly, all of these results should be viewed as lower bounds. Given evidence that not all consumers and physicians may be fully informed of these regulatory shocks, it is probable that the effects we document do not capture the full aggregate demand shock and impacts on firm performance.

41 More broadly, this can be viewed as a loss to the firm. Given the underlying assumptions of an event study, these losses are abnormal or unexpected. The first source of loss is the future sales of the relabeled drug, which we attempt to approximate. Given the findings in Krieger et al. (2018), the next obvious source of unexpected loss would be to future innovation. It remains plausible that our results are also detecting other types of loss which remain unknown. For example, declining sales of the affected drug may cause inefficiencies in other downstream cospecialized assets, thereby raising costs to the firm.

Complementing our baseline results, we find increasing impacts across all levels of relabeling severity (Table 6). Consistent with prior literature (e.g., Dorsey et al., 2010) we find the greatest impact for the most severe type of relabel. Less intuitive, however, is why we see such a significant demand response for the least severe relabel (i.e., precaution). Conditional on receiving a precaution, there is a significant probability that a drug will be relabeled again in the future, so it is plausible that physicians are preemptively switching consumers to other drugs. After accounting for intra- and inter-market substitution (Table 7) we find a 4.0% decline in aggregate demand. While we conjectured in the paper as to physician and consumer motivations, understanding the reasons behind their respective responses is left for future work.

We exploit other variation in our data. For example, we break markets into "low-intensity" and "high-intensity" markets based on the level of relabeling activity within a particular 4-digit ATC market. In the case of low-intensity markets (Tables 4 and A5) and low-intensity markets across types of relabeling (Tables A9 and A10), we find that the entire decline in aggregate demand was absorbed by intra-market substitution. That is, consumers were all successfully switched to other drugs within the same 4-digit ATC market. In contrast, in the case of high-intensity markets (Table 5 and Table A6) and high-intensity markets across types of relabeling (Tables A10 and A11) we find persistent declines in aggregate demand. This split is an important caveat to the extant literature, especially the work focused on box warnings (e.g., Dorsey et al., 2010; Jacoby et al., 2005), because it suggests the impacts are more nuanced.

A significant body of work has focused on elasticity and brand loyalty within the pharmaceutical industry (e.g., Bala et al., 2017). These issues are critical, for example, for pricing strategies and for how firms respond to competitors and structure end-of-life strategies for branded products. Taken together, our baseline findings suggest that firms should also be concerned with the magnitude of consumer (and physician) response to adverse news from relabeling events. While some of these shifts may be medically warranted, others may be due to competitor behavior (Macher and Wade, 2016), physicians responding defensively, consumers acting irrationally or some combination of these. Given that we control for firm detailing/promotion activities, our findings suggest that affected firms are not able to stem the decline in demand. All of this suggests that how physicians (and consumers) receive information may have important implications.

More broadly, our results have implications for policymakers. There are a number of FDA programs that offer expedited development and review for new drugs. These programs all attempt to bring new, novel drugs to market more quickly. Evidence exists that these programs have been successful (Chambers et al., 2017). However, drugs approved through these expedited pathways are also more likely to suffer from serious safety label changes (Mostaghim et al., 2017; Moore and Furberg, 2014; Carpenter et al., 2008). As we have documented throughout this analysis, those changes have significant impacts on firm performance, including downstream aggregate demand as well as upstream innovation. These impacts add another layer of complication for regulators to consider in balancing safety with speed. Importantly, this evidence, combined with our results and those of Krieger et al. (2018), suggests that we may be trading quicker access to new, novel drugs today for less innovation tomorrow.

This trade-off suggests our results have plausible welfare implications. By its nature, regulation should be welfare enhancing but there is evidence that this may not always be the case (e.g., Kessel, 1967; Sloan and Steinwald, 1980; Thomas, 1985, 1987; Peltzman, 1987; Ter-Martirosyan and Kwoka, 2010). If consumers who leave the market should be treated, then this shift to the non-treated population could be a detriment to welfare. Moreover, if consumers remain treated but are switched to drugs that are less effective, this will again be a detriment to welfare. On the other hand, it is widely believed that some drugs are overprescribed (Lembke et al., 2018; Sacarny et al., 2016; Forgacs and Loganayagam, 2008; Price et al., 1986). If it is these consumers who exit the market, then the impact on welfare may be dampened. Combined with this dynamic is the impact on welfare from lost future innovation. Balanced against these potential losses are the gains from the true purpose of relabeling: potentially preventing consumers from being harmed. The world is already witnessing a demonstration of these tradeoffs in the rapid approval of COVID-19 drugs and vaccines. Further work is warranted.

Market size and innovation: theory and evidence from the pharmaceutical industry

The Impact of Regulation on Innovation

Demand drops and innovation investment: evidence from the great recession in Spain

The impact of information technology on the diffusion of new pharmaceuticals. NBER working paper 23257

Pharmaceutical product recalls: category effects and competitor response

Negative Shocks and Innovation: Evidence from Medical Device Recalls

Intercultural business communication: a rich field of studies

Direct and indirect effects of regulation: a new look at OSHA's impact

Predation through regulation: the wage and profit effects of the occupational safety and health administration and the environmental protection agency

Market size and innovation: effects of Medicare Part D on pharmaceutical research and development

Regulation and welfare: evidence from Paragraph-IV generic entry in the pharmaceutical industry. NBER working paper 17188

A critical review of methods to evaluate the impact of FDA regulatory actions

Dynamic modeling of inventories subject to obsolescence

Pharmaceutical product withdrawal: attributions of blame and its impact on trust

Drug-review deadlines and safety problems

The unexpected consequences of generic entry

Drugs cleared through the FDA's expedited review offer greater gains than drugs approved by conventional process

Strategic management of R&D pipelines with cospecialized investments and technology markets

The corporate social responsibility of pharmaceutical product recalls: an empirical examination of US and UK markets

The relation between research priorities and societal demands: the case of rice

National trends in the safety performance of electronic health record systems from

Productivity in pharmaceutical-biotechnology R&D: the role of experience and alliances

Effects of physician-directed pharmaceutical promotion on prescription behaviors: longitudinal evidence

Impact of FDA black box advisory on antipsychotic medication use

Health care markets, regulators, and certifiers

The economic side effects of dangerous drug announcements

In: Pharmaceutical Profits and the Social Value of Innovation

Market size and pharmaceutical innovation

Impact of FDA drug risk communications on health care utilization and health behaviors: a systematic review

Nonadherence to antiepileptic drugs and increased mortality: findings from the RANSOM study

Non-adherence to antibiotic therapy in patients visiting community pharmacies

Static and dynamic effects of health policy: evidence from the vaccine industry

Overprescribing proton pump inhibitors

Product recalls, imperfect information, and spillover effects: lessons from the consumer response to the 2007 toy recalls

Effective patent life in pharmaceuticals

Inventing and maximizing

The economics of health, safety, and environmental regulation

Management Mistakes & Successes

The intensity and extensity of knowledge and the multinational corporation as a nearly recomposable system (NRS). Manag

Market size and innovation: the intermediary role of technology licensing

The outsourcing of R&D through acquisition in the pharmaceutical industry

A mixed methods approach to assessing actual risk readership on branded drug websites

After the black box warning: dramatic changes in ED use of droperidol

Controlling Hospital Costs: The Role of Government Regulation

Prospect theory: an analysis of decision under risk

Trends in antipsychotic use in dementia

Economic effects of federal regulation of milk markets

Innovation: the interplay between demand-side shock and supply-side environment

Find and Replace: R&D Investment Following the Erosion of Existing Products

The economic consequences of legal origins

Our other prescription drug problem

Generalizing what is known about temporal aggregation and advertising carryover

Changes in antidepressant use by young people and suicidal behavior after FDA warnings and media coverage: quasi-experimental study

In: The "Black Box" of Strategy: Competitive Responses to and Performance Responses to Adverse Regulatory Events

The effects and role of direct-to-physician marketing in the pharmaceutical industry: an integrative review

Heterogeneous Innovation and the Antifragile Economy. U.C. Berkeley. Working paper

Event studies in management research: theoretical and empirical issues

Safety related changes for new drugs after approval in the US through expedited regulatory pathways: retrospective cohort study

Communicating safety information to physicians: an examination of dear doctor letters

Controls versus subsidies in the economic theory of regulation

Serious adverse drug events reported to the Food and Drug Administration

Development time, clinical testing, postmarket followup, and safety risks for the new drugs approved by the US food and drug administration: the class of

Social capital, intellectual capital, and the organizational advantage

Information and consumer behavior

Measuring market power in the ready-to-eat cereal industry

Invention, Growth, and Welfare: A Theoretical Treatment of Technological Change

Information regulation: do the victims of externalities pay attention?

A case-control study of antidepressants and attempted suicide during early phase treatment of major depressive episodes

Effects of Food and Drug Administration warnings on antidepressant use in a national sample

Pharmaceutical policy change and the safety of new drugs

The risk we bear: the effects of review speed and industry user fees on new drug safety

An explanation into the determinants of research intensity

Tightening environmental standards: the benefit-cost or the no-cost paradigm?

The global crisis and firms' investments in innovation

Toward a more general theory of economic regulation

The health effects of mandatory prescriptions

The impact of technology-push and demand-pull policies of technical change -does the locus of policies matter? Res

PP024-Adverse drug events and medication errors related to psychotropic drugs in patients presenting at an emergency department

Meta-analysis of the heritability of human traits based on fifty years of twin studies

Doctors' unawareness of the drugs their patients are taking: a major cause of overprescribing?

Market withdrawal of new molecular entities approved in the United States from 1980 to

Generic drug industry dynamics

Advertising and competition in the ethical pharmaceutical industry: the case of antihypertensive drugs

Medicare letters to curb overprescribing of controlled substances had no detectable effect on providers

Invention and Economic Growth

Capitalism, Socialism and Democracy

Entry decisions in the generic drug industry

Responding to an FDA warning-Geographic variation in the use of rosiglitazone

Effects of regulation on hospital costs and input use

Contraindicated use of cisapride: impact of food and drug administration regulatory action

Theory of economic regulation

The relationship between suicidal behavior and productive activities of young adults

Incentive regulation, service quality, and standards in U.S. electricity distribution

Patient compliance with antimicrobial drugs: a Chinese study

Advances in prospect theory: cumulative representation of uncertainty

Understanding choice: why physicians should learn prospect theory

Impact of safety warnings on drug utilization: marketplace life span of cisapride and troglitazone

Pharmaceutical promotion and GP prescription behavior

Estimation of clinical trial success rates and related parameters

Adverse drug event surveillance and drug withdrawals in the United States, 1969-2002: the importance of reporting suspected reactions

Consumer perceptions of price, quality, and value: a means-end model and synthesis of evidence

The financial impact of product recall announcements in China

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.respol.2020.104126.