key: cord-0136289-es2d10x7 authors: Siegel, Jonathan; Grinsted, Lynda; Liu, Feng; Weber, Hans-Jochen; Englert, Stefan; Casey, Michelle title: Censoring and censoring mechanisms in oncology in light of the estimands framework date: 2022-03-03 journal: nan DOI: nan sha: f063eff6ceb144e803f7a239ae4a742f7cabb0ed doc_id: 136289 cord_uid: es2d10x7 In oncology clinical trials with time-to-event endpoints, censoring rules have traditionally been defined and applied following standard approaches based on longstanding regulatory guidelines. The estimand framework (addendum to the ICH E9 guideline) calls for precisely defining the treatment effect of interest to align with the clinical question of interest, and requires predefining the handling of intercurrent events that occur after treatment initiation and either preclude the observation of an event of interest or impact the interpretation of the treatment effect. In the context of time to event endpoints, this requires a careful discussion on how censoring rules are applied. This paper explains the importance of distinguishing censoring concepts that have traditionally been merged. Specifically, noumenal censoring as an estimation method to address an intercurrent event which influences the clinical question itself (i.e. defines the pre-specified estimand), distinguishing it from phenomenal censoring that addresses administratively missing information (i.e. defines missing data handling in the estimate). Strategies for dealing with the most relevant intercurrent events, the need for close alignment of the clinical question of interest and study design, impact on data collection and other practical implications will be discussed. The authors recommend defining trial-specific rules as well as considering when noumenal and phenomenal censoring is used. These considerations also apply to defining relevant and interpretable sensitivity and supplementary analyses. 
With the release of the addendum to the ICH E9 guidance 1, the estimand framework is being applied widely in the design and analysis of clinical trials. The estimand framework provides the terminology to clearly describe what a clinical trial is intending to assess and, consequently, which data need to be collected and how these data are to be analysed. In light of the estimand framework, this paper provides a discussion on the impacts of censoring in time to event analyses on the clinical question of interest. This paper is a product of the

The estimand framework defines strategies to deal with intercurrent events, events occurring after randomization or treatment initiation that affect either the interpretation or the existence of the measurements associated with the clinical question of interest. It is important to differentiate missing data (e.g., data not available due to trial completion or participant withdrawal from study) from intercurrent events. As outlined in the ICH E9 R1 guidance 1, strategies for handling intercurrent events include the treatment policy strategy, which considers the intercurrent event irrelevant; the hypothetical strategy, which addresses a clinical question in which the event would not have occurred; the composite variable strategy, used when the event is considered informative for the outcome and is thus defined as part of the variable of interest; the while on treatment strategy, used when the treatment effect up to the onset of the intercurrent event is relevant; and the principal stratum strategy, used when the effect is considered relevant only for a subset of the study population 2. In oncology trials, assessing the treatment effect by time to event endpoints like progression-free survival (PFS) is typically of high interest and referenced in health authority guidelines.
In the past, censoring rules for time to event endpoints have been proposed within protocols and analysis plans without giving consideration to the fact that different event and censoring rules might address different clinical questions of interest. In addition, in the past, no distinction was made between censoring as an estimation method to address an intercurrent event which influences the clinical question itself (i.e., defines the a priori defined estimand), and censoring that addresses missing information (i.e., defines missing data handling in the estimate). The terms noumenon and phenomenon are borrowed from philosophy to highlight this differentiation: In Kantian epistemology, a noumenon is an unobservable thing-in-itself, and a phenomenon is what is observable to the senses. Acknowledging that these terms are used outside of their original philosophical context, the authors believe they capture the key distinction between what is intended to be estimated but not necessarily observed, the estimand, and what is actually observed in the course of the trial, the main estimate of the main estimator, and hence represent useful and appropriate terms to convey this important conceptual distinction in the context of clinical trial design and analysis 3 . In the following, we discuss the technique of censoring in oncology trials under the perspective of the estimand framework. Based on the revised FDA endpoint guidances 4, 5 and EMA Guideline on the evaluation of anticancer medicinal products in man Appendix 1 6 , strategies to address typical intercurrent events in oncology are presented with examples. These show how different strategies can impact the scientific question of interest. Finally, recommendations to define the relevant estimand with regard to typical intercurrent events are given. It is acknowledged that other types of data, such as binary and continuous data collected for safety and quality of life endpoints are included in many Oncology trials. 
Given the focus on censoring, this paper concentrates on time to event efficacy endpoints. Additionally, although all 5 attributes of the estimand are important, different summary measures are not described in this discussion on censoring. Classical time-to-event analysis methods are based on the critical assumption that when the event of interest has not yet occurred and patients are censored, this censoring is noninformative with respect to the scientific question of interest and hence can be appropriately handled by censoring. Non-informative censoring is independent of, and provides no information about, the question of interest. In contrast, patients who discontinue participation in a trial for treatment-related reasons such as perceived lack of efficacy or treatment side effects may well have a different risk of experiencing the event in the future than other patients (informative censoring). In addition, it has traditionally been common to handle certain types of intercurrent events, such as subsequent therapy, with censoring even if the event of interest is observed afterwards. It is well acknowledged that informative censoring is a source of bias for estimation 7 . Approaches have been developed to adjust for informative censoring requiring more or less stringent assumptions between censoring times and baseline covariates 8, 9, 10, 11 . Competing risk analyses might be performed if the event of interest cannot be observed when other informative event(s) prevent further follow-up 12 . Other authors suggest optimizing trial designs to avoid or minimize informative censoring 13 . If informative censoring is considered to be minimal the potential bias is usually ignored. Reporting the number of dropouts is suggested to increase transparency 14 . Additionally, an assessment of censoring patterns including the distribution of censoring reasons across treatment arms is recommended to determine the robustness of results from clinical trials 15 . 
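As a reference point for the classical analysis described above, the following is a minimal sketch of the Kaplan-Meier estimator, which treats every censored patient as having the same future risk as the patients still under observation (the non-informative censoring assumption). The function name `kaplan_meier` and the five-patient data set are hypothetical and chosen only for illustration.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time for each patient
    events : 1 if the event of interest was observed, 0 if censored
    Censored patients leave the risk set without changing S(t), which is
    only valid if censoring is non-informative.
    """
    observed = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)        # every patient leaves the risk set once
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]       # events and censorings at t drop out
    return curve

# Hypothetical data: events at t = 2, 3, 5; censorings at t = 3 and 8
times = [2, 3, 3, 5, 8]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

If the patient censored at t = 3 had in fact withdrawn for a treatment-related reason, the estimate after t = 3 would rest on an assumption that is unlikely to hold, which is the concern raised in the text.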
A common practice is to perform additional analyses using different censoring rules 16 to assess the impact of potential bias. If sensitivity analyses fail to come to the same conclusion as the primary analysis, then bias or lack of robustness is considered. However, bias needs to be contextualized with regard to the scientific question and the corresponding estimation. Bias might remain an unclear concept if assessed in the reverse order. When focussing on the scientific question, the clinical meaning of an intercurrent event and its relevance to the underlying scientific question is the relevant aspect, and estimation (e.g. censoring, statistical methodology) becomes a consequence of the scientific question of interest 17. Every clinical trial has a unique scientific question of interest in the context of a specific treatment condition, population, variable, intercurrent-event strategy, and population-level summary measure. Typical clinical events in such a setting that have led to censoring need to be reviewed and discussed with regard to their relationship to the clinical question. Bias can only be determined in this context. This perspective underlines the need to define study-specific rules tailored to the scientific question of interest and to consider the necessary data collection and patient follow-up to address the scientific question of interest.

Substantial progress has been made over recent years to understand, diagnose and to treat cancer. Life expectancy has increased in many indications 18.

For all time to event endpoints, the time between a triggering event (e.g. randomization or first dose) until a clinical event of interest occurs is analysed. The definition of time to event endpoints often implies a composite strategy defining a set of relevant clinical events. We are interested in the first onset of any of these events.
Typical time to event endpoints in oncology are defined as follows 6 :
• OS is the time from randomization (or first dose, in non-randomized trials) until death from any cause
• PFS is the time from randomization (or first dose, in non-randomized trials) to objective disease progression, or death from any cause, whichever occurs first
• DFS is the time from randomization (or first dose, in non-randomized trials) to objective disease recurrence or death from any cause, whichever occurs first
• EFS is typically indication specific, but not necessarily consistent in different protocols. For example, for acute myeloid leukaemia EFS was defined as the time from randomization to (induction) treatment failure, relapse for those who have induction treatment success (e.g. complete remission), or death from any cause, whichever occurs first 19, 20 .

During the follow-up of patients, situations might occur that either preclude the observation or affect the interpretation of the variable. These are defined in the ICH E9 addendum as intercurrent events. It is important to differentiate missing data from intercurrent events: random missing data will neither bias the estimator nor change the estimand, but weaken the estimator, typically through loss of statistical power. In contrast, intercurrent events (e.g. discontinuation of therapy due to lack of efficacy or adverse events, initiation of subsequent therapy), if not predefined, could change the intended estimand and, depending on the estimand, could introduce bias if ignored. Thus, all potential intercurrent events require an a priori assessment at the design stage to define the appropriate approach on how to deal with them, along with a discussion of how censoring is applied. According to the estimands guidance 1, administrative closure of the trial results in missing data, which is a classic example of phenomenal censoring.
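The PFS definition given above (objective progression or death from any cause, whichever occurs first) can be sketched as a small derivation rule. This is an illustrative sketch only: the `Patient` fields and the choice to censor event-free patients at the last assessment are assumptions made for the example, not rules taken from a specific guideline or protocol.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Patient:
    # All days are counted from randomization (hypothetical field names).
    progression_day: Optional[int]   # objective progression, None if not observed
    death_day: Optional[int]         # death from any cause, None if not observed
    last_assessment_day: int         # last day with adequate follow-up

def pfs(p: Patient) -> Tuple[int, int]:
    """Return (time, event) for PFS; event = 1 if progression or death occurred."""
    candidates = [d for d in (p.progression_day, p.death_day) if d is not None]
    if candidates:
        return min(candidates), 1    # whichever component occurs first
    # No component event observed, e.g. administrative closure of the trial:
    # this censoring is phenomenal (missing data), not part of the estimand.
    return p.last_assessment_day, 0

examples = [
    Patient(progression_day=120, death_day=200, last_assessment_day=210),
    Patient(progression_day=None, death_day=90, last_assessment_day=60),
    Patient(progression_day=None, death_day=None, last_assessment_day=180),
]
results = [pfs(p) for p in examples]
print(results)
```

The composite nature of PFS is visible in the `min` over both component events: death is handled as part of the variable rather than by censoring.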
An individual decision to discontinue treatment based on perceived lack of efficacy or toxicity will generally represent a potential intercurrent event. Both missing data and intercurrent events can occur while the patient is still being followed. For time-to-event variables, the observation period of a patient can be separated into the time up to the occurrence of an intercurrent event or endpoint of interest and the time thereafter. Just by the chronological sequence it can be concluded that the risk of the event of interest for the time interval up to the intercurrent event is not impacted by the intercurrent event, but the risk for the time thereafter is. A key decision in the estimands framework is whether a particular event or class of events results in missing data or should be classified as intercurrent. In order to make this classification, it is critically important to collect necessary data, particularly data about the reasons for discontinuation or loss to follow-up for the relevant assessments. We suggest that under the estimand framework, only cases where loss to follow-up has no relevant relationship to the treatment effect should be classified as missing data. Cases where there is a potential relationship should generally be considered intercurrent events or integrated in other attributes of the estimand. Given that clinical events may be informative, data collection on the outcome of interest and on events that were not defined a priori as intercurrent events (but may be considered as such with growing evidence) should continue where feasible. Both missing data and intercurrent events can occur without the need for censoring. An isolated missed appointment is often unrelated to treatment effect and is regarded as missing data. Many potentially biasing events (changes in relevant concomitant medications, assessment methods, etc.) represent intercurrent events, but do not result in a patient being lost to follow-up for the event of interest.
In both cases, depending on the strategy used, there may not be a need to censor. Frequently observed intercurrent events in oncology trials include deaths, discontinuation of treatment due to toxicity or subjective clinical progression, and initiation of new anti-cancer therapy. In the presence of intercurrent events, the assumption of non-informativity is unlikely to hold. Strategies to address intercurrent events are linked to the three attributes treatment, population and variable:
• Treatment policy and hypothetical strategies generally impact the treatment definition, especially when the ICE is treatment related.
• Principal stratum strategies impact the population definition
• Composite and while-on-treatment strategies impact the variable definition

Different estimand strategies can be applied to intercurrent events, depending on the estimand of interest. We discuss each strategy below.

The term "treatment policy strategy" comes from the idea of evaluating the effect of the complete sequence of treatments, beginning with randomization and including all treatments thereafter, through treatment switching 7. When applied to a general intercurrent event, the research question evaluates all outcomes up to the event of interest, through and beyond the applicable intercurrent event. Answering this research question requires following patients through and beyond the intercurrent event (i.e., regardless of treatment discontinuation, subsequent therapy, protocol deviation, or other potential intercurrent events) until the clinical event of interest is observed, the study ends, or the patient is lost to follow-up. It considers all data observed relevant to the scientific question of interest. When correctly executed, it is generally the strategy most likely to preserve the assumption that censoring is non-informative. There is no noumenal censoring; only phenomenal censoring may be applied.
Clarifying the scientific objective and treatment element of a treatment policy strategy as addressing the complete treatment regimen including both the randomized treatment and all subsequent therapy does not solve all difficulties. For a clinical trial's conclusion to be clinically relevant, the treatment regimen evaluated, including both the original randomized treatment and all subsequent therapy, needs to reflect expected actual clinical practice if the experimental study treatment is approved. As further discussed under hypothetical strategies below, a trial can have artificial elements not reflective of ordinary clinical care that can render inference from a treatment policy strategy to real-world practice problematic. Additional feasibility considerations should also be considered. A treatment policy strategy requires consistent, unconditional follow-up through and beyond intercurrent events until the event of interest is reached. This degree of follow-up can be challenging to obtain in practical clinical trials and may be infeasible in some cases. Patients may often wish to withdraw from a trial following treatment discontinuation, or for reasons related to treatment efficacy or safety. Some level of deviation from assumptions is inevitable in any clinical trial. However, if under trial conditions patient dropout becomes pervasive and systematic, the assumptions underlying a treatment policy strategy may simply be infeasible, and a strategy that does not require or assume such follow-up will occur in most patients may be more appropriate. Composite strategies make intercurrent events part of the variable (or endpoint), by making the event of interest a composite event that includes both the original event and the intercurrent one. This strategy is particularly useful and provides an alternative to censoring when the intercurrent event is positively correlated with, and highly informative of, the event of interest. 
Perhaps the best-known example in oncology is PFS, a composition of progression and death events. There may be situations where standard intercurrent events for radiological progression may be considered for composition in addition to death. For example, in a context where it is impractical to expect patients to remain on study after clinical deterioration until documented progression is observed and clinical deterioration without documented progression is common enough to strongly impact study results, it might be reasonable to consider a composite event combining clinical deterioration with documented progression and death. This balances the heterogeneity and potential bias associated with the subjectivity of clinical progression against the informative censoring bias resulting from too many patients withdrawing prior to progression being objectively documented. Other informative intercurrent events that have been suggested as candidates for composition include start of 2nd therapy 21 and treatment failure 22 . Composite strategies might be appropriate in a wide variety of other contexts. Event-free survival involves a composition of death and other events. Composite strategies might also be considered for intercurrent events that end assessments. For example, many oncology studies end clinic visits at radiological progression, with only survival follow-up afterwards. In these studies, time-to-event variables based on clinic visits effectively end at radiological progression. For such variables, if radiological progression is considered likely to be correlated with the event of interest, a composite strategy might be considered. Examples include indicators of disease worsening such as time to symptom deterioration. Hypothetical strategies ask what would have happened in a counterfactual scenario, e.g., if the intercurrent event had not occurred. 
Such a strategy may be particularly appropriate when the relevant intercurrent event only occurs because of clinical conditions, and would not occur in real-world clinical practice at all. Hypothetical strategies tend to remove events from the clinical question of interest. Generally speaking, events following the intercurrent events are not relevant. Hypothetical strategies represent an example where events are removed from the noumenon, the question of interest itself (i.e., by censoring after an intercurrent event, a simple hypothetical strategy is applied to the corresponding estimand). The assumption is that patients that are censored have the same hazard of experiencing an event as those not censored. However, such noumenal censoring is often informative, and therefore, where appropriate, alternative strategies should be considered. More sophisticated hypothetical approaches based on counterfactuals, such as inverse probability censoring weighting (IPCW) or rank preserving structural failure time (RPSFT), are also available. Counterfactual survival times are survival times that would have occurred if the intercurrent event had not happened, as estimated by a model 23 . These methods have been subject to criticism because of the strength of the assumptions required, particularly the assumption of no unmeasured confounding. However, they should be considered, despite their strong assumption requirements, for intercurrent events that exist solely as a result of clinical trial conditions, that is, when the real-world clinical conditions representing the trial's analytic inferential goal are counterfactual to the trial conditions. For example: • In blinded trials, when there are multiple experimental treatments being studied in the same class, patients assigned to the control arm may later enter another trial evaluating an experimental treatment in the same class. 
• In open-label trials, patients not assigned to the desired treatment may immediately withdraw without receiving study treatment, or otherwise withdraw earlier than would be the case in regular clinical practice. For example, in Checkmate 037, 23% of patients randomised to the control arm did not receive treatment compared with 1% randomised to active therapy 24 .

Where the context of a clinical trial results in conditions and induces patient choices not likely to be repeatable in real-world post-approval clinical practice, a counterfactual hypothetical strategy may better reflect the scientific question feasibly addressable by such a trial. While the reliability of the answer provided by hypothetical strategies may be criticized, they have the advantage of addressing a clinically relevant question. 23

A while-on-treatment strategy is only concerned with what happens up to the time of the intercurrent event. In a while-on-treatment strategy, anything after the intercurrent event is considered irrelevant to the scientific question of interest. An example is palliative treatment, where clinical interest lies in evaluating alleviation of symptoms prior to death, whenever death occurs. There are two basic models used to implement a while-on-treatment strategy for time-to-event estimands, the cause-specific hazards model and the subdistribution hazards model. 25, 26 If independent causes can be assumed, for example time to cause-specific death where causality attribution can be reliably established and independence is a reasonable assumption, then a while-on-treatment strategy can be implemented through a cause-specific hazards model, which uses standard censoring and hence represents an interpretation of standard proportional hazards and related censoring models. Censoring for an intercurrent event in a cause-specific hazards model represents noumenal censoring.
As discussed above, causal independence can rarely be reliably established in oncology, and time to cause-specific event estimands are accordingly not common. When the independence assumption lacks an unequivocal basis, which we suggest is often and indeed usually the case in oncology, a while-on-treatment strategy can be implemented using a competing risk method with the intercurrent event classified as a competing risk event. 27 Competing risk events are removed from the noumenon through a method other than censoring. Our forthcoming 2nd paper discusses this and other such cases in more detail, using the more general concept of occlusion. 28 A number of additional situations in oncology are potentially amenable to a while-on-treatment strategy. For example, in evaluating time to safety events, it may be appropriate to consider the occurrence of such events up until death (a "while alive" strategy). Here anything after death becomes irrelevant to the scientific question of interest. Similarly, analyses of time to treatment-emergent adverse event, which are considered of interest only up to a specified time after treatment, could also potentially be implemented using a while-on-treatment (here while-treatment-emergent) strategy. Once the treatment-emergent period ends, subsequent events are not considered relevant to the scientific question of interest, the analysis of treatment-emergent events. Given that events like death and treatment withdrawal are often not independent of the relevant safety event of interest, a competing risk strategy as an alternative approach to censoring might be an appropriate way to address this fact. The Fine-Gray subdistribution model is not without problems. It requires care in interpretation. Once a competing risk event is experienced, the event of interest cannot occur but the subject is not removed from the risk set, effectively contributing immortal time.
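To illustrate how a competing risk event can be handled by a method other than censoring, the following is a minimal sketch of the nonparametric cumulative incidence estimate (the Aalen-Johansen form), in which a competing event reduces the overall event-free probability instead of being treated as a censoring. The function name, the cause coding (0 = censored, 1 = event of interest, 2 = competing event) and the data are hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Return [(t, CIF(t))] at each time where the event of interest occurs.

    causes: 0 = censored, 1 = event of interest, 2 = competing event
            (hypothetical coding used only for this sketch).
    """
    at_risk = len(times)
    event_free = 1.0     # probability of no event of any type just before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        tied = [c for tt, c in zip(times, causes) if tt == t]
        d_interest = sum(1 for c in tied if c == cause_of_interest)
        d_any = sum(1 for c in tied if c != 0)
        if d_interest:
            cif += event_free * d_interest / at_risk
            curve.append((t, cif))
        # Competing events lower event_free; censorings only shrink the risk set.
        event_free *= 1.0 - d_any / at_risk
        at_risk -= len(tied)
    return curve

# Hypothetical data: event of interest at t = 2 and 5, competing event at t = 3
times = [2, 3, 4, 5, 6]
causes = [1, 2, 0, 1, 0]
curve = cumulative_incidence(times, causes)
for t, p in curve:
    print(f"t={t}: CIF={p:.2f}")
```

For these five hypothetical patients the estimate at t = 5 is 0.5. Censoring the competing event at t = 3 and taking one minus the Kaplan-Meier estimate would instead give 0.6, because that approach implicitly assumes the patient with the competing event could still experience the event of interest.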
Subdistribution hazards are not generally independent, and a subdistribution hazards model may not have a simple relationship with the cumulative incidence function (CIF) or a causal interpretation. The descriptive CIF, analogous to a Kaplan-Meier curve, may be both more comprehensible and more interpretable. 25

A principal stratum strategy 29, 2 attempts to study only patients in whom the intercurrent event of interest is not expected to occur, generally using a model to predict such patients from baseline characteristics. Accordingly, the strategy addresses the intercurrent event by removing patients expected to experience it from the study population. For patients for whom the event occurs despite the model's prediction that it will not, ordinary censoring can be applied, and the censoring involved is phenomenal. Just as too high a proportion of patients with data occluded by an intercurrent event can invalidate a treatment policy strategy, too high a proportion of patients experiencing the intercurrent event, while not changing the strategy, can render the model underlying a principal stratum strategy nonpredictive and invalid. A principal stratum strategy has not been common in regulatory oncology clinical trials.

Handling of typical kinds of oncology intercurrent events can be summarized as follows:

Death. For most time-to-event endpoints used in oncology, death is included as a component of the composite endpoint and is therefore handled by a composite strategy. Time to progression, where death is not included in the endpoint (and can therefore be estimated using noumenal censoring), may also be of interest if deaths are unlikely to be related to the disease being studied, a situation not common in late-stage cancer. While truly independent deaths could be addressed as an independent-causes while-on-treatment strategy, we do not recommend this interpretation for the typical oncology case, where dependence is generally at least a possibility.
We believe the approach is better interpreted as a hypothetical strategy describing what would have happened if the patient hadn't died, assuming that death is noninformative. A while-on-treatment strategy that does not assume non-informativity can be implemented through a competing risk approach. An example would be the competing risk approach to time to bone fracture with death as a competing risk event. 11

Discontinuation of treatment due to toxicity. Noumenal censoring at treatment discontinuation could result in a hypothetical strategy, describing what would have happened if patients had remained on treatment (i.e. had not experienced the toxicity). Following patients until the event of interest regardless of discontinuation of therapy would be a treatment policy approach. Toxicity could also be included in the event of interest in a composite strategy.

Subjective clinical progression. Similar considerations apply as for discontinuations of treatment due to toxicity. Concerns about bias in the clinical assessment of progression have generally resulted in regulatory agencies recommending documented progression as a basis for clinical trial outcomes. However, in certain disease settings it may not be feasible to keep patients under observation once they experience subjective clinical progression. If patients with subjective clinical progression have a higher risk of future documented progression or death than patients without progression of any kind, then assuming that patients with subjective progression have the same future risks as patients without progression of any kind may result in underestimating their future risks, overestimating their future times to event, and hence potentially overestimating treatment effects and benefits, particularly if more patients in the treatment group have subjective progression than in the control. Thus, a treatment policy strategy may lead to informative censoring.
An alternative would be to include subjective progression as an event in a composite strategy.

Incorrect medication. If a patient receives the wrong treatment during a trial, the patient could be followed up according to treatment policy, or the patient could be noumenally censored at the time of incorrect medication to assess what would have happened if the correct medication had been received (hypothetical strategy).

Initiation of new anti-cancer therapy. Many strategies exist for handling initiation of subsequent anti-cancer therapy, and the most appropriate option may depend on the specific setting and the timing of initiation of therapy. Sometimes new therapies are part of the planned treatment intervention; as such they are part of the treatment strategy attribute and should therefore not be considered intercurrent events. Subsequent therapy that does not reflect real-world clinical practice can be regarded as an intercurrent event confounding inferability to future real-world practice. A hypothetical strategy, which asks what would have happened if the confounding intercurrent event (the subsequent therapy) had not occurred, may be appropriate. Hypothetical strategy options include noumenal censoring of patients at the start of new therapy if it occurs before the event of interest, and causal inference methods such as rank-preserving structural failure time (RPSFT) to estimate the outcome had patients not crossed over from control to active treatment, 11 the 2-stage method to estimate the outcome had patients not crossed over at a specific disease-related time point such as progression, 31 and inverse probability censoring weighting (IPCW) to estimate the outcome in the absence of new therapy.
11 Following patients until the event of interest regardless of therapy would be a treatment policy approach whereas including the initiation of a new therapy as an event would be a composite strategy and may be considered if the initiation of therapy is thought to be correlated with outcome. See Manitz (2022) for a more detailed discussion. 23 As discussed above, certain aspects of trial conduct are not expected to be translatable to clinical practice. For example, where there are multiple therapies being developed in the same class at the same time, patients may leave one trial and enter another, typically if randomized to the control arm. This is reasonable patient behavior in a clinical trial context because of the possibility a patient assigned control therapy aims for a chance to receive the experimental therapy which potentially provides additional benefit. But this behavior will not occur in clinical practice, where there is no uncertainty about which drug is being taken. 23 Where the clinical trial context results in artificial trial-specific intercurrent events not expected to be applicable in clinical practice, resulting subsequent therapy can be regarded as an intercurrent event confounding inference to clinical practice. In this case, a counterfactual hypothetical strategy, which asks what would have happened if the confounding intercurrent event had not occurred, may be particularly appropriate. In such a context, where the trial itself does not represent clinical practice, a treatment policy strategy may enumeratively predict the outcome of a future replication of the trial very reliably, but its analytic prediction of real-world practice will be confounded. A valid regulatory clinical trial requires a reasonable prediction of real-world practice, not merely a prediction of a future repetition of trial conditions. 
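The three simplest options for initiation of new anti-cancer therapy discussed above (treatment policy, simple hypothetical via noumenal censoring, and composite) can be contrasted on a single patient record. The sketch below is illustrative only: the function `derive_endpoint`, its argument names and the strategy labels are hypothetical, and the model-based hypothetical methods (RPSFT, 2-stage, IPCW) are deliberately not covered.

```python
from typing import Optional, Tuple

STRATEGIES = ("treatment_policy", "hypothetical", "composite")

def derive_endpoint(strategy: str,
                    event_day: Optional[int],
                    new_therapy_day: Optional[int],
                    last_followup_day: int) -> Tuple[int, int, Optional[str]]:
    """Return (time, event, censoring_type) under one intercurrent-event strategy.

    censoring_type is "noumenal" when censoring defines the estimand,
    "phenomenal" when it only reflects missing follow-up, None for an event.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    therapy_first = new_therapy_day is not None and (
        event_day is None or new_therapy_day < event_day)
    if strategy == "hypothetical" and therapy_first:
        # Simple hypothetical strategy: censor at start of new therapy.
        return new_therapy_day, 0, "noumenal"
    if strategy == "composite" and therapy_first:
        # New therapy counts as a component event of the variable.
        return new_therapy_day, 1, None
    # Treatment policy, or no new therapy before the event: follow through.
    if event_day is not None:
        return event_day, 1, None
    return last_followup_day, 0, "phenomenal"

# Hypothetical patient: new therapy on day 200, event on day 300, followed to day 400
by_strategy = {s: derive_endpoint(s, 300, 200, 400) for s in STRATEGIES}
for name, value in by_strategy.items():
    print(name, value)
```

The same observed data yield three different (time, event) pairs, which is the sense in which the censoring rule follows from the clinical question rather than the reverse. Note that the treatment policy result is only available if the patient was actually followed beyond the new therapy.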
As ICH E9 R1 notes, "usually an iterative process will be necessary to reach an estimand that is of clinical relevance for decision making, and for which a reliable estimate can be made." 1 Subsequent therapy may represent an example of a conflict between clinical relevance and reliable estimation, requiring care in selecting a scientific question and an intercurrent event strategy that represent a reasonable balance between the two.

Surgery and stem cell transplant. Surgery may occur during oncology trials for multiple reasons. For example, surgery may be a planned procedure in neoadjuvant trials, or a palliative or curative treatment in later-stage disease. For an EFS endpoint, surgery could be a component of a composite strategy. Alternatively, if unplanned surgery is potentially curative, a treatment policy approach may be considered, following the patient beyond surgery until the event of interest. Palliative surgery could be handled similarly to initiation of anti-cancer therapy described above. Similar considerations apply to stem cell transplant as an intervention. 21 It may not always be possible to follow patients beyond surgery. Censoring at the time of surgery would constitute a hypothetical strategy, and would have issues similar to censoring for other subsequent therapy. Surgery is often informative of future outcome.

Other situations require consideration of phenomenal censoring. Examples include incomplete follow-up and missing data for central assessment of images following progression from local review. Here, the censoring technique can be considered a sensitivity analysis using a different estimation method, rather than aligning these situations to a specific estimand, as they answer the same clinical question. Further implications of censoring resulting from independent central assessment of images are discussed in Fleischer (2010).
32 In addition to these typical intercurrent events in oncology clinical trials, there may be intercurrent events observed during the course of a trial that could not be foreseen at the design stage and that cannot be controlled by study procedures, such as the occurrence of a pandemic. 33 Such cases require careful re-assessment of the estimand definition. 1

The estimand definition(s) in a trial determine how long rigorous data collection is required. If the primary estimand requires collection up to a certain intercurrent event, it may be necessary to continue collection beyond this event to inform other estimands. This has implications for both study design and data collection. In the estimands framework, data collection is as important as design. The chosen estimator will generally require data on intercurrent events. This may require augmentation or even redesign of standard data collection systems and procedures. For relevant data, each assessment should be recorded with the assessment date, whether the assessment occurred, the reason for no assessment, and a causality assessment (relatedness to study treatment). This causality assessment could be similar to the one commonly used to determine treatment relatedness of AEs. The list of reasons for no assessment should not be limited to predefined intercurrent events. The underlying reason should be captured in sufficient granularity to identify intercurrent events that were not considered relevant at the design stage. The same documentation requirement applies to withdrawal from study treatment, change in therapy, and withdrawal from or failure to perform assessments. To construct a time-to-event variable, it is important to document when no data are collected, including the reason. As an example, for assessing time to adverse event (AE), we recommend recording the date of collection and, if an assessment was missed, documenting the reason for the missed assessment.
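The data items recommended above (assessment date, whether the assessment occurred, reason for no assessment, and relatedness to study treatment) can be illustrated with a small record structure. This is a sketch only, assuming a hypothetical layout: the names `Status`, `AssessmentRecord` and `last_performed_date` are not taken from any eCRF standard, and the reason is kept as free text so that it is not limited to predefined intercurrent events.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PERFORMED_NO_FINDING = "performed, no AE found"
    PERFORMED_FINDING = "performed, AE found"
    NOT_PERFORMED = "not performed"


@dataclass
class AssessmentRecord:
    scheduled_date: date
    status: Status
    reason_not_performed: Optional[str] = None    # free text, not a closed list
    related_to_treatment: Optional[bool] = None   # investigator's causality judgement

    def __post_init__(self) -> None:
        # A missed assessment without a documented reason cannot later be
        # classified as missing data or as an intercurrent event.
        if self.status is Status.NOT_PERFORMED and not self.reason_not_performed:
            raise ValueError("a missed assessment needs a documented reason")


def last_performed_date(records: List[AssessmentRecord]) -> Optional[date]:
    """Date of the last assessment that actually took place, if any."""
    performed = [r.scheduled_date for r in records
                 if r.status is not Status.NOT_PERFORMED]
    return max(performed, default=None)
```

Recording "performed, no AE found" separately from "not performed" is what allows a later analysis to tell an event-free interval from an unobserved one.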
Without careful attention to ensuring that the needed data can be collected, log formats commonly used for data collection may not yield this information, and can make it difficult to distinguish between a finding that no AEs occurred and a failure to perform an AE assessment. Alternative approaches to data collection might be considered when absence of documentation is informative and failure to perform assessments is a common concern. For example, digital devices can collect physical function data to replace frequent clinic visits to complete questionnaires. Such automated data collection avoids additional burden for patients and minimizes the risk of stopping follow-up prematurely. Improvements in technology have already aided data collection outside clinic visits and are likely to continue to do so.

It is not uncommon in oncology clinical trials for data collection considerations based on the primary variable to affect data collection for secondary variables. For example, if the primary variable is PFS, clinic visits, even when extending past subsequent therapy, will typically end at documented radiological progression. Ending clinic assessments at progression will occlude data collection for secondary estimands requiring in-clinic assessments, for example time to forced lung capacity deterioration and time to symptom improvement or deterioration. As discussed above, when the study design ends assessments at progression, simply censoring at end of assessments results in implicitly censoring for progression. This implicit noumenal censoring induces an implicit hypothetical strategy for progression, even though progression is not mentioned in the censoring table.
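One way to make such implicit censoring visible is to count, among patients censored for a secondary endpoint, how many had no secondary assessment after documented progression. The sketch below is a heuristic under stated assumptions, not a validated diagnostic: the `Row` fields and the function name `implicit_censoring_fraction` are hypothetical, times are days from randomization, and a censored patient is attributed to progression whenever progression was documented and the last secondary assessment did not fall after it.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    secondary_event_day: Optional[int]    # e.g. symptom deterioration; None if not observed
    last_secondary_assessment_day: int    # last in-clinic assessment of the secondary variable
    progression_day: Optional[int]        # documented radiological progression, if any


def implicit_censoring_fraction(rows: List[Row]) -> float:
    """Fraction of censored patients whose follow-up ended at or before progression."""
    censored = [r for r in rows if r.secondary_event_day is None]
    if not censored:
        return 0.0
    ended_at_progression = [
        r for r in censored
        if r.progression_day is not None
        and r.last_secondary_assessment_day <= r.progression_day
    ]
    return len(ended_at_progression) / len(censored)
```

A large fraction would suggest that "censoring at end of assessments" is in practice censoring for progression, and that progression should be named as an intercurrent event for that estimand.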
Accordingly, where assessment ends at an event based on the needs of a different, higher-priority estimand, and the ending event may be informative with respect to the event of interest, we recommend that the study design team explicitly acknowledge the event resulting in loss to further clinic visits as an intercurrent event, and devise an appropriate strategy that is both scientifically reasonable and feasible in the context. An explicit hypothetical strategy would be preferable to leaving the effect of the assessment-ending event unacknowledged. But other strategies acknowledging the impact of the event on interpretation may be more appropriate, particularly where the relevant event is highly informative. It might, for example, be appropriate to consider a composite strategy and assess time to the earlier of symptom deterioration or progression. In some cases, the lowered priority of the estimand may reflect the fact that what happens after the assessment-ending event is not of sufficient interest to be worth study in the context of the particular clinical trial.

Compromises are inevitable in clinical research. It is important, however, for the design team to first identify the clinical question of interest and understand the optimum strategy to address that question. Once that is done, the team should proceed to investigate whether the clinical conditions, estimand priority within the study, and other factors render the strategy feasible in the context. If a compromise is necessary, the team should be conscious of both what is desired and what is possible. It should understand the limitations that compromise imposes on the ability to address the original question, including how changing the strategy changes the research question, and how design features can implicitly change the strategy.
The Estimands Guidance notes that "usually an iterative process will be necessary to reach an estimand that is of clinical relevance for decision making, and for which a reliable estimate can be made." Figure 1 illustrates a proposal for an iterative approach to construct an estimand that is both of clinical relevance for decision making and operationally feasible at the design stage. This iterative approach is similar to Deming's Plan-Do-Study-Act cycle. 34

Figure 1: Illustration of an iterative approach to construct an estimand that is of clinical relevance for decision making and operationally feasible. [Figure 1 here]

Every strategy is based on assumptions and requires conditions to be valid. The purpose of sensitivity analyses is to check, to the extent feasible, whether the applicable assumptions required for the primary estimand are reasonable under the circumstances. The discussion here focuses on sensitivity analyses, as defined in the ICH E9 addendum, related to noumenal and phenomenal censoring. Sensitivity analyses are a particular problem in a survival context. Events that occur after censoring are not always documented, and it is not possible to know whether censoring is informative. Similarly, assumptions needed for a hypothetical analysis are often unverifiable. Nonetheless, standard censoring checks and sensitivity analyses are possible and commonly employed. They include:
• Assessing the distribution of censoring to see if it occurs evenly between the arms.
• Analyses with and without censoring of events that are observed following more than one missed visit.
• Checks for key traditional statistical assumptions such as proportional hazards.
• Interval-censored analyses.

As discussed above, a common issue in survival trials is the use of treatment policy strategies where the feasibility of systematically following patients beyond the intercurrent event of concern may be questionable. We recommend sensitivity analyses to address this assumption.
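The first check listed above, whether censoring occurs evenly between the arms, can be tabulated with a few lines of code. The following is a minimal descriptive sketch, assuming each patient is represented by a hypothetical tuple of arm label, event indicator and recorded censoring reason; the function name `censoring_by_arm` is likewise an assumption made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def censoring_by_arm(
    rows: Iterable[Tuple[str, bool, Optional[str]]]
) -> Dict[str, dict]:
    """Summarize censoring per arm from (arm, is_event, censor_reason) tuples."""
    totals: Counter = Counter()
    censored: Counter = Counter()
    reasons: Dict[str, Counter] = defaultdict(Counter)
    for arm, is_event, reason in rows:
        totals[arm] += 1
        if not is_event:
            censored[arm] += 1
            reasons[arm][reason or "not documented"] += 1
    return {
        arm: {
            "n": totals[arm],
            "censored_fraction": censored[arm] / totals[arm],
            "reasons": dict(reasons[arm]),
        }
        for arm in totals
    }
```

Marked differences between arms in the censored fraction, or in the mix of censoring reasons, would be a prompt for closer review rather than a formal test of informative censoring.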
Such sensitivity analyses of follow-up beyond the intercurrent event could be done with simple descriptive methods. For example, for PFS studies with a treatment policy strategy for events like end of treatment, clinical progression, or change of therapy without radiological progression, we recommend identifying what proportion of patients had these events prior to progression, and what proportion did not receive further tumor assessments beyond these events. Large differences in these events between arms could lead to concern. For these analyses, some care may be required to differentiate patients whose loss to assessment follow-up was clearly due to other events, e.g. unambiguously terminal events. For example, patients who died shortly after change of therapy could be assumed to have received full follow-up regardless of the change.

Supplementary analyses, by contrast, target a different estimand and provide additional insights into the treatment effect. For example, these could include exploration of the components of a composite endpoint and comparison of results from local and central tumour assessments.

This manuscript highlights that noumenal censoring should be considered as a method to clearly define the estimand, and thus this censoring approach needs to align with the estimand attributes, including population, variable, treatment strategy, and population-level summary. Different strategies to address intercurrent events, along with the aligned censoring approach (where applicable), are given in Table 1. Many of these recommended practices were employed prior to the release of the addendum, without the explicit link between the estimand and the data collected.

Phenomenal censoring at the date of last assessment.

In addition, the following practices should be considered:
• Ensuring that strategy is aligned with protocol design and visit schedule. Treatment policy strategies require systematic follow-up through and beyond the relevant intercurrent event.
• Ensuring that implementing the strategy, including the needed data collection and visit schedule, will be feasible and appropriate in the study context.
• Identifying all potentially informative events that systematically result in stopping data collection or removing subsequent data from the analysis and developing an appropriate strategy to address them. This includes events implicit in the study design, visit schedule, withdrawal criteria, etc.
• Identifying, defining and minimizing missing data. Ensuring appropriate data collection and addressing intercurrent events requires long term planning, eCRF design and careful trial monitoring. Where feasible, consider collecting data beyond the intercurrent event to enable further sensitivity or supplementary analyses.
• Collect data on all relevant intercurrent events (treatment withdrawal, change in therapy, withdrawal from assessments, etc.), including reasons and investigator's assessment of treatment-relatedness.
• Sensitivity analyses should be used to evaluate whether the assumptions behind the strategy chosen were reasonable under the circumstances.
• Another strategy should be considered where the assumptions underlying the treatment policy may not be met, e.g.:
  a) Single-arm trials
  b) Trials where it is considered unethical or otherwise infeasible to follow patients beyond an intercurrent event expected to be frequent enough to substantially affect results
  c) Trials where an assumption that an applicable intercurrent event does not depend on treatment arm cannot be met (e.g. subjective assessments in open-label studies)
  d) Trials where a change in treatment is part of the design, e.g. rescue therapy
  e) Trials where intercurrent events are an artifact of trial circumstances and not expected to reflect (i.e. are expected to be counterfactual to) clinical practice (e.g. treatment switching to another experimental treatment in the same class).
In such cases a different estimand (based on the composite, hypothetical, or while-on-treatment strategy) might be considered, depending on the clinical question of interest. A treatment policy strategy might still be the best available option under the circumstances despite not being ideal. An iterative approach, analogous to Figure 1, is suggested.

non-proportional (e.g. delayed effects), the estimates can depend on the point at which the trial is ended, even if the censoring involved is uninformative. In the past, the assumption of non-informative censoring has rarely been challenged by regulatory authorities. Except in special cases like the pre-planned maturation of the trial, censoring is often informative. Its traditional widespread use to estimate in the presence of intercurrent events has often ignored the potential to bias results. The estimand framework addresses this issue because the concept of intercurrent events closely resembles the survival analysis concept of informative censoring. Hence the framework's strategies for identifying and addressing intercurrent events provide methods for handling situations where the non-informative censoring assumption does not apply.

Using the terminology of Kant, 35 the estimand could be considered the population-level noumenon which is to be estimated, and to which inference is made, in terms of the patient-level phenomenon being observed in a clinical trial. A key distinction between noumenal and phenomenal censoring is that noumenal censoring is inconsistent with a treatment policy strategy. When noumenal censoring occurs, it generally results in a hypothetical strategy. Phenomenal censoring, on the other hand, does not change the strategy from a treatment policy one. Whether censoring is noumenal or phenomenal, the non-informativity assumption underlying the use of censoring must be valid for the chosen strategy to be validly interpretable.
Censoring for intercurrent events is often informative, and the validity of the non-informativity assumption should be checked. In an estimand setting, censoring is often replaced by another implementation mechanism. For example, composite strategies will handle events as a component of the event of interest; while-on-treatment strategies may handle them as a competing risk event; hypothetical strategies may implement a causal inference model; and so on. Alternatives to noumenal censoring may be particularly appropriate.

Recommended strategies are proposed to address common intercurrent events. In general, the treatment policy strategy, regardless of intercurrent events, has become the default standard and is recommended for randomised pivotal clinical trials where data can be consistently and systematically collected until the event of interest or study termination. From an estimands perspective, the treatment policy strategy reflects the entire treatment regimen, including subsequent therapies. Other strategies such as the composite, hypothetical, or while-on-treatment strategy might be considered when the question of interest differs or where the assumptions underlying the treatment policy are not met. A treatment policy strategy will not insulate the trial from confounding, and events such as subsequent therapy that are inconsistent with real-world practice might better be handled as confounding intercurrent events than treated as censorable.

It is important to recognize that changing the strategy and the handling of intercurrent events changes the estimand and its interpretation. Different approaches are necessary to address different clinical questions of interest. As discussed in Section 6, the approach to study design is an iterative process to ensure alignment of the clinical question of interest with the estimand strategy. Implicit noumenal censoring, which induces an implicit hypothetical strategy, should be checked for in every study design.
Events which terminate assessments or trigger withdrawal criteria, such as progression in many studies, are particular candidates. When the study design systematically stops assessments at a particular event, the normal catch-all censoring at end of assessments in fact systematically censors for the assessment-terminating event, and this noumenal censoring results in an implicit hypothetical strategy. We recommend making all such cases explicit, and specifying the appropriate strategy to use. The visit schedule and criteria for withdrawing from or completing study phases that result in ending particular assessments should be carefully checked in this regard, and the visit schedule and withdrawal criteria should be designed in tandem with the relevant estimands, with the impact of each on the other considered.

The purpose of a clinical trial is often to predict real-world clinical practice, especially so for a registrational trial. Certain elements of a clinical trial, such as randomization, are not reflective of real-world practice, and may induce patient behaviour not replicable in the real world. Where this occurs, a treatment policy strategy might not be the most appropriate. A counterfactual hypothetical strategy, which asks what would have happened if the non-real-world behaviour had not occurred, might be more relevant to real-world practice and hence might sometimes be preferable, despite problems with establishing the reliability of causal inference methods.

Rigorous data collection and trial monitoring are the key to addressing and distinguishing missing data and intercurrent events. The estimands framework depends on good data collection, including data about missed assessments. Events that are not collected cannot be managed or addressed. It is therefore critical to obtain the reasons for and dates of withdrawals and losses to follow-up, and to obtain data permitting assessment of the existence and dates of underlying intercurrent events.
While not described within the estimand guidance, one could consider framing the concept of censoring in the broader context of occluding events, with occlusion representing any loss to further follow-up and/or removal of further collected data from analysis. Occlusion constitutes a broader concept than the estimand guidance's "intercurrent event" and "terminal event", and one appropriate to addressing a broader set of time-to-event methods in an estimands context. The subject of occlusion is described in greater detail in our forthcoming second paper. 28

The estimands framework is not simply new language to describe conventional practices. It requires a rethinking. We anticipate it will have an impact on the data collection and interpretation of most if not all oncology studies with time-to-event analyses.

No new data are presented in this manuscript. This manuscript is based solely on previously published results.

1. E9(R1) - Addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials
2. Principal stratum strategy: Potential role in drug development
3. Kant's transcendental idealism. Stanford Encyclopedia of Philosophy
4. Clinical Trial Endpoints for the Approval of Non-Small Cell Lung Cancer Drugs and Biologics: Guidance for Industry
5. Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics: Guidance for Industry
6. Guideline on the evaluation of anticancer medicinal products in man. Methodological consideration for using progression-free survival (PFS) or disease-free survival (DFS) in confirmatory trials
7. Analysis of progression-free survival in oncology trials: Some common statistical issues
8. General right censoring and its impact on the analysis of survival data
9. Presenting censored survival data when censoring and survival times may not be independent
10. Bounds on net survival probabilities for dependent competing risks
11. Correcting for noncompliance and dependent censoring in an AIDS clinical trial with Inverse Probability of Censoring Weighted (IPCW) log-rank tests
12. Competing risks analysis of patients with osteosarcoma: a comparison of four different approaches
13. Censoring issues in survival analysis
14. The role of censoring on progression free survival: Oncologist discretion advised
15. GRADE Guidelines: 29. Rating the certainty in time-to-event outcomes - Study limitations due to censoring of participants with missing data in intervention studies
16. Informative censoring - a neglected cause of bias in oncology trials
17. Treatment effect quantification for time-to-event endpoints - Estimands, analysis strategies, and beyond
18. Progress in cancer survival, mortality, and incidence in seven high-income countries 1995-2014 (ICBP SURVMARK-2): a population-based study
19. Revised recommendations of the International Working Group for Diagnosis, Standardization of Response Criteria, Treatment Outcomes, and Reporting Standards for Therapeutic Trials in Acute Myeloid Leukemia
20. Acute Myeloid Leukemia: Developing Drugs and Biological Products for Treatment. Draft Guidance for Industry
21. Estimands in hematologic oncology
22. Inappropriate censoring in Kaplan-Meier analyses
23. Estimands in clinical trials with treatment switching in oncology
24. Nivolumab versus chemotherapy in patients with advanced melanoma who progressed after anti-CTLA-4 treatment (CheckMate 037): a randomised, controlled, open-label, phase 3 trial
25. Practical recommendations for reporting Fine-Gray model analyses for competing risk data
26. On estimands and the analysis of adverse events in the presence of varying follow-up times within the benefit assessment of therapies
27. A proportional hazards model for the subdistribution of a competing risk
28. The role of occlusion: Potential extension of the ICH E9 (R1) addendum on estimands and sensitivity analysis for time-to-event oncology studies
29. Principal stratification in causal inference
30. Addition of radium-223 to abiraterone acetate and prednisone or prednisolone in patients with castration-resistant prostate cancer and bone metastases (ERA 223): a randomised, double-blind, placebo-controlled, phase 3 trial
31. Adjusting for treatment switching in randomised controlled trials - a simulation study and a simplified two-stage method
32. How is retrospective independent review influenced by investigator-introduced informative censoring: a quantitative approach
33. Assessing the Impact of COVID-19 on the Clinical Trial Objective and analysis of oncology clinical trials - application of the estimand framework
34. Out of the Crisis
35. Kritik der reinen Vernunft