key: cord-0021291-1n2xoi5q authors: Hou, N.; Wu, J.; Xiao, J.; Wang, Z.; Song, Z.; Ke, Z.; Wang, R.; Wei, M.; Xu, M.; Wei, J.; Qian, X.; Xu, X.; Yi, J.; Wang, T.; Zhang, J.; Li, N.; Fan, J.; Hou, G.; Wang, Y.; Wang, Z.; Ling, R. title: Development, verification, and comparison of a risk stratification model integrating residual cancer burden to predict individual prognosis in early-stage breast cancer treated with neoadjuvant therapy date: 2021-09-16 journal: ESMO Open DOI: 10.1016/j.esmoop.2021.100269 sha: 608a81135cf4aead023ff933a40c30616dc45da2 doc_id: 21291 cord_uid: 1n2xoi5q BACKGROUND: A favorable model for predicting disease-free survival (DFS) and stratifying prognostic risk in breast cancer (BC) treated with neoadjuvant chemotherapy (NAC) is lacking. The aim of the current study was to formulate an excellent model specially for predicting prognosis in these patients. PATIENTS AND METHODS: Between January 2012 and December 2015, 749 early-stage BC patients who received NAC in Xijing hospital were included. Patients were randomly assigned to a training cohort (n = 563) and an independent cohort (n = 186). A prognostic model was created and subsequently validated. Predictive performance and discrimination were further measured and compared with other models. RESULTS: Clinical American Joint Committee on Cancer stage, grade, estrogen receptor expression, human epidermal growth factor receptor 2 (HER2) status and treatment, Ki-67 expression, lymphovascular invasion, and residual cancer burden were identified as independent prognostic variables for BC treated with NAC. The C-index of the model consistently outperformed other available models as well as single independent factors with 0.78, 0.80, 0.75, 0.82, and 0.77 in the training cohort, independent cohort, luminal BC, HER2-positive BC, and triple-negative BC, respectively. 
With the optimal cut-off values (280 and 360) selected by X-tile, patients were categorized into low-risk (total points ≤280), moderate-risk (280 < total points ≤360), and high-risk (total points >360) groups with significantly different 5-year DFS of 89.9%, 56.9%, and 27.7%, respectively. CONCLUSIONS: In patients with BC, this first model to include the residual cancer burden index predicted individual survival with favorable performance and discrimination. Furthermore, the risk stratification it generates could determine the risk level of recurrence in the whole early-stage BC cohort and in subtype-specific cohorts, help tailor personalized intensive treatment, and select comparable study cohorts in clinical trials.

Breast cancer (BC) is the most common malignancy, and its mortality rate ranks second among all cancer-related deaths in women. 1 Neoadjuvant chemotherapy (NAC) has become an established treatment option for locally advanced BC to reduce tumor size and to increase the breast conservation rate. 2,3 BC patients who receive NAC show great heterogeneity of disease with significantly different survival, 4, 5 which makes a risk stratification model critical. To date, existing models such as the Rouzier model, 6 the Clinical-Pathologic Scoring System incorporating estrogen receptor (ER)-negative disease and nuclear grade 3 tumor pathology (CPS + EG scoring system), 7 the Colleoni model, 8 the neoadjuvant response index (NRI), 9 the Keam model, 10 the Nottingham Clinico-Pathological Response Index (NPRI), 11 and Neo-Bioscore 12 have been developed, validated, and extensively used in the clinic. [13] [14] [15] However, almost all these studies are seemingly dated because human epidermal growth factor receptor 2 (HER2)-targeted therapies and informative pathological measurements were not taken into consideration. 
[6] [7] [8] [9] [10] [11] [12] 16, 17 A nomogram created for the population covered by recent clinical guidelines is therefore urgently required, and comparisons of the novel model with previous models are crucial. Basic and predictive parameters for formulating a model specific to patients with BC encompass clinical stage, histology type, grade, the expression of ER, progesterone receptor (PR), Ki-67, and HER2, residual tumor burden [including pathological complete remission (pCR) status, pathological AJCC stage], and lymphovascular invasion (LVI). [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] Among these indicators, residual tumor appears especially important, not only because it shows a negative correlation with prognosis but also because its cancer burden ranges from a single cancer cell to a large-diameter tumor in these patients, which, in part, could lead to a great degree of heterogeneity in the BC population after NAC. Several randomized studies and meta-analyses demonstrated that patients achieving pCR during NAC have longer disease-free survival (DFS) and overall survival (OS) than those with residual cancer. [23] [24] [25] It is for this reason that almost all prior studies have only focused on patients with pCR or residual cancer. However, since the measurement criteria of the residual cancer burden index (RCB) were proposed by Symmans et al. in 2007, 26 RCB score and class have been widely used in many studies. 4, [27] [28] [29] [30] [31] RCB is based on the measurement of histopathological indicators such as number of positive nodes, diameter of the largest metastatic node, and the size and percent cellularity of the primary tumor bed (www.mdanderson.org/breastcancer_RCB). 
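As an illustration of how the histopathological measurements listed above combine into a single index, the following Python sketch follows the RCB formula and class cut-points as published with the Symmans et al. criteria (index = 1.4 (f_inv · d_prim)^0.17 + [4 (1 − 0.75^LN) d_met]^0.17, with class boundaries at 1.36 and 3.28). The formula is not restated in this article, so it should be treated as an assumption here; the function and parameter names are hypothetical, and the sketch is not the authors' implementation.

```python
import math


def rcb_index(d1_mm, d2_mm, cellularity_pct, in_situ_pct,
              positive_nodes, largest_node_met_mm):
    """Residual cancer burden index (assumed Symmans et al. formula).

    d1_mm, d2_mm         bidimensional diameters of the primary tumor bed (mm)
    cellularity_pct      overall cancer cellularity, % of tumor bed area
    in_situ_pct          % of the cancer that is in situ disease
    positive_nodes       number of positive lymph nodes
    largest_node_met_mm  diameter of the largest nodal metastasis (mm)
    """
    d_prim = math.sqrt(d1_mm * d2_mm)
    # Fraction of the tumor bed occupied by invasive cancer.
    f_inv = (1 - in_situ_pct / 100) * (cellularity_pct / 100)
    primary_term = 1.4 * (f_inv * d_prim) ** 0.17
    nodal_term = (4 * (1 - 0.75 ** positive_nodes) * largest_node_met_mm) ** 0.17
    return primary_term + nodal_term


def rcb_class(index):
    """Map an RCB index to its class using the published cut-points."""
    if index == 0:
        return "RCB-0"  # pathological complete response
    if index <= 1.36:
        return "RCB-I"
    if index <= 3.28:
        return "RCB-II"
    return "RCB-III"


if __name__ == "__main__":
    score = rcb_index(20, 15, 30, 10, 2, 5)
    print(round(score, 2), rcb_class(score))
```

Because both terms are raised to the power 0.17, even a small amount of residual invasive tumor moves the index well away from zero, which is consistent with treating pCR (RCB-0) as its own class.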
Recent findings showed that no prognostic difference in OS or relapse-free survival was observed between pCR and low residual cancer burden class (I), 27, 28, 31 which provides a rationale for establishing a predictive model for all patients (including pCR and non-pCR) who received NAC. However, there is currently no prognostic model that can be applied to all early-stage BC patients treated with NAC. In addition, individualized prediction has been regarded as an important requisite for an excellent predictive model, [32] [33] [34] considering that relapse is a frequent cause of death in BC and substantially affects the quality of life of BC patients. More importantly, a prognostic model with excellent discrimination can help develop new drugs for the population in the high-risk group. 35 Therefore, it is crucial for clinicians to construct a novel model to accurately stratify the risk level of recurrence. In the present study, we sought to create a risk stratification model that can represent continuous prognostic risk and can be applied to predict individual prognosis and separate patients into different risk groups in BC with NAC.

In the current study, we collected the clinical and pathological data of 876 patients who were diagnosed with BC and received NAC at Xijing Hospital in 2012-2015. The following data were selected as study variables: age at diagnosis, family history of BC, menopausal status, clinical AJCC stage, laterality, histology, pathological information, and treatment data. Family history of BC refers to a patient's first- or second-degree relatives with BC. The inclusion criteria were as follows: (i) female BC diagnosed by positive histology; (ii) NAC given to all patients. The exclusion criteria were as follows: (i) bilateral BC; (ii) metastatic BC; (iii) unavailable ER information; (iv) no mastectomy received; (v) incomplete follow-up; (vi) unavailable pathology slides. 
Finally, 749 patients were enrolled into the study. The eligible population was randomly assigned in a 3:1 ratio to a training cohort and an independent cohort using computer-generated random numbers. The flowchart of the study is shown in Figure 1A. The clinical stage was assessed based on the sixth edition of the AJCC BC staging system before NAC. 36 The detailed NAC regimen and dose were described in our prior reports. 37 The study was approved by the institutional review board of Xijing Hospital and received an exemption of informed consent from the local ethics committee.

The primary endpoint of our study was DFS. The secondary endpoints of this study were OS, local regional recurrence (LRR), and distant metastasis (DM). DFS was measured from the date of mastectomy to the date of disease recurrence, death, or the last follow-up. OS was defined as the time from the date of mastectomy to the date of death due to any cause or the last follow-up. LRR was defined as first cancer recurrence in the regional areas (including axillary, supraclavicular, internal mammary nodes, and chest wall). DM was identified as first recurrence beyond the LRR area as defined above. Patients with local and distant recurrence were analyzed in LRR and DM, respectively.

All patients underwent a pathological evaluation at the Department of Pathology. Formalin-fixed paraffin-embedded post-NAC resection specimens were retrieved from the Pathology Department Archive. The assessment of the puncture biopsy specimens obtained before NAC was used to determine the histology type and grade of BC. Considering that patients with pCR have no residual cancer burden, we carried out immunohistochemistry evaluation for ER, PR, HER2, and Ki-67 based on the pathology slides obtained before NAC. Only nuclear reactivity was considered for ER, PR, and Ki-67, and expression was recorded as a continuous variable. 
HER2/neu positivity was defined as complete membrane staining in >10% of tumor cells or HER2 amplification detected by fluorescence in situ hybridization (FISH). Primary tumor bed, overall cancer cellularity (as percentage of area), percentage of cancer that is in situ disease, number of positive lymph nodes, and diameter of the largest metastasis in resected lymph nodes after NAC were pathologically evaluated on hematoxylin-eosin-stained slides according to the criteria reported by Symmans et al. 26 Figure 1B and C shows the measurement of bidimensional diameters of the primary tumor bed (d1, d2) and diameter of the largest nodal metastasis. RCB was assessed and divided into RCB classes. LVI and skin involvement were carefully assessed on post-NAC pathological slides. The pathology slides were evaluated independently by three pathologists among the authors.

Chi-square test or Fisher's exact test was used to compare categorical variables between the training cohort and independent cohort, whereas quantitative variables were reported as median with interquartile range (IQR) and compared using Student's t-test or the non-parametric Mann-Whitney U test. Patients who were alive at the last follow-up date (15 September 2020) or lost to follow-up were censored at the time of the last contact. Kaplan-Meier analysis was employed to calculate survival rates, and the log-rank test was used to compare the differences between the curves. Variables with P <0.05 in univariable Cox analysis were incorporated into multivariable Cox analysis to identify independent prognostic factors of BC in the training cohort. A backward step-down selection identified a final model according to the Akaike information criterion. 38 A nomogram for predicting DFS was built based on the multivariable Cox results. Bootstrap resampling (200 resamples) was applied to validate the model's performance. 
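To make the survival-rate calculation concrete, the following minimal Python sketch implements the standard Kaplan-Meier product-limit estimator with right censoring. It is an illustration only: the study's analyses were run in R, the function name `kaplan_meier` is hypothetical, and the toy data are invented for the example rather than taken from the cohort.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times   follow-up time for each patient
    events  1 if the event (e.g. recurrence or death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Toy data: months of follow-up; the second patient is censored.
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Censored patients contribute to the denominator up to their last contact and then drop out of the risk set without lowering the curve, which matches the censoring rule described above.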
The 3-year and 5-year receiver operating characteristic (ROC) curves and calibration curves were drawn to evaluate the predictive performance of the nomogram in the internal and independent validation. 39 The larger the area under the ROC curve (AUC), the higher the predictive accuracy of the nomogram. The closer the calibration curve is to the ideal curve, the more unbiased the predictive performance of the model. The time-dependent ROC curves, corresponding AUC values, and Harrell's C-indexes were utilized to measure and compare the performance of the final model, Rouzier model, CPS + EG scoring system, Colleoni model, Keam model, Neo-Bioscore, RCB, AJCC, and Ki-67 39 among the training cohort, independent cohort, luminal BC, HER2-positive BC, and triple-negative BC (TNBC) groups. Decision curve analysis was carried out to identify whether the nomograms could be deemed useful tools for clinical decision making by comparing the net benefits at any threshold probability. 40 All patients were grouped into three risk stratums (low-risk, moderate-risk, and high-risk group) according to two optimal cut-offs identified by X-tile in the training cohort. 41 Statistical tests were two-sided, and P values <0.05 were considered as statistically significant. All statistical analyses were conducted using R version 3.3.6 (http://www.R-project.org/) with packages rms, timeROC, caret, and ggDCA.

A total of 749 BC patients who received NAC were finally selected. Among them, 563 (75.2%) patients were included in the training cohort and 186 (24.8%) patients were included in the independent cohort (Figure 1A). Baseline characteristics between the two cohorts are presented in Supplementary Table S1, available Table S3, available at https://doi.org/10.1016/j.esmoop.2021.100269). 
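Harrell's C-index, the discrimination measure used for the model comparisons in this study, can be sketched in a few lines of Python. The version below is a plain pairwise implementation for illustration, assuming a higher score means higher predicted risk; the function name and toy data are hypothetical, and it does not reproduce the R `rms` routines the authors used.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's follow-up time. The pair is concordant when the
    patient with the earlier event also has the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [1, 2, 3, 4]
    events = [1, 1, 0, 1]
    scores = [4, 3, 2, 1]  # e.g. nomogram total points, higher = riskier
    print(harrell_c_index(times, events, scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported C-indexes (for example 0.78 in the training cohort) should be read.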
The 1-year, 2-year, 3-year, 4-year, 5-year, and 6-year AUROC values and C-indexes of the current nomogram were higher than those of other models and factors in the training cohort and independent cohort, indicating a favorable performance and discrimination (Figure 3A and B). The 3-year and 5-year decision curve analysis indicated that the net benefit of the nomogram robustly outperformed other models and single factors in the training cohort (Figure 3C and D) and independent cohort (Figure 3E and F). We utilized X-tile software to generate two optimal cut-offs (280 and 360, Supplementary Figure S2A-C, available at https://doi.org/10.1016/j.esmoop.2021.100269), which divided BC into three groups with a highly significantly different probability of recurrence (Figure 2A): low risk (total points ≤280, n = 334 in the training cohort, and n = 102 in the independent cohort), moderate risk (280 < total points ≤360, n = 167 in the training cohort, and n = 65 in the independent cohort), and high risk (total points >360, n = 62 in the training cohort, and n = 19 in the independent cohort). In the entire population, the 5-year DFS of low-risk, moderate-risk, and high-risk groups was 89.9% (95% CI, 87.0% to 92.8%), 56.9% (95% CI, 50.3% to 63.5%), and 27.7% (95% CI, 17.8% to 37.6%), respectively. With the low-risk group as reference, the hazard ratios (HRs) for moderate-risk and high-risk groups were 5.56 (95% CI, 3.91-7.91; P < 0.001) and 13.32 (95% CI, 9.01-19.71; P < 0.001), respectively (Figure 4A). Similar trends were also observed in the training cohort and independent cohort. The cumulative incidence curves for recurrence and OS curves were significantly different among three groups in the training cohort and independent cohort (all with log-rank P < 0.001, Figure 4B-E). The cumulative incidence curves for LRR and DM of three groups are shown in Supplementary Figure S3, available at https://doi.org/10.1016/j.esmoop.2021.100269, and were significantly different (log-rank P < 0.001).

The performance of risk stratification model in subtype-specific BC cohorts. The performance and discrimination of the model and other models as well as some factors were compared (Supplementary Table S4, available at https://doi.org/10.1016/j.esmoop.2021.100269). The AUROC values and C-indexes of the risk stratification model were higher than those of other models and factors in the luminal BC cohort, HER2-positive BC cohort, and TNBC cohort, which indicated good discrimination performance (Figure 5A-C). The 3-year and 5-year calibration curves presented good agreement between predictions and observations in the probability of 3-year and 5-year DFS among subtype-specific cohorts (Figure 5D-F). The cumulative incidence curves for recurrence and OS curves were distinctly different in the luminal BC cohort, HER2-positive BC cohort, and TNBC cohort (all with log-rank P < 0.001, Figure 5G-L). There was also statistically significant difference among cumulative incidence curves for LRR and DM in the three groups (both log-rank P < 0.001, Supplementary Figure S4

Figure 5. Time-dependent AUC values of the current model, other available models, and single independent predictors in subtype-specific BC cohorts (A-C); the calibration curves of nomogram for predicting 3-year DFS and 5-year DFS in subtype-specific BC cohorts (D-F); the cumulative incidence of recurrence curves of three risk stratums in subtype-specific BC cohorts (G-I); and overall survival curves of three risk stratums in subtype-specific BC cohorts (J-L). AUC, area under receiver operating characteristic curve; BC, breast cancer; DFS, disease-free survival.

substantiating the significant implication of individualized estimation and risk stratification in clinical trial design and practice. 
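The three-group stratification reported in the Results reduces to two threshold comparisons on the nomogram total points. The Python sketch below encodes only the published cut-offs (280 and 360); how the total points themselves are obtained from the nomogram is not reproduced here, and the function name `risk_group` is hypothetical.

```python
def risk_group(total_points, low_cut=280, high_cut=360):
    """Assign a risk group from nomogram total points.

    Cut-offs follow the X-tile values reported in the study:
    low risk       total points <= 280
    moderate risk  280 < total points <= 360
    high risk      total points > 360
    """
    if total_points <= low_cut:
        return "low"
    if total_points <= high_cut:
        return "moderate"
    return "high"


if __name__ == "__main__":
    for points in (250, 280, 300, 360, 400):
        print(points, risk_group(points))
```

Keeping the cut-offs as parameters makes it straightforward to re-derive them in another cohort instead of assuming that 280 and 360 transfer unchanged.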
AJCC stage, grade, ER expression, HER2 status and treatment, Ki-67 expression, and LVI have been determined as independent predictors in BC patients who received NAC. [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] T stage and N stage, which indicate primary and metastatic lymph node burden, respectively, have been seen as predictors in several studies. 15, 18 The model we constructed incorporated AJCC stage, which takes together these two factors. Specially, ER expression was utilized as a continuous variable similar to several pivotal studies and LVI was classified as no, focal, and diffused by experienced pathologists, 8, 43 which might be more superior than two classifications. Furthermore, several studies consistently confirmed that RCB is the important prognostic indicator and outperforms other predictors. The robust predictor was included in the present study, which partly contributes to the favorable performance of the model. Additionally, it seems a bit paradoxical that post-mastectomy radiotherapy (PMRT) appears as an adverse indicator for prognosis in univariable regression analysis, yet shows no significant difference in the multivariable analysis. This is likely due to the fact that patients with PMRT have a higher proportion of high-risk group than that of non-PMRT group (15.4% versus 5.9% of high-risk group in the PMRT cohort and non-PMRT cohort, respectively, P < 0.001). The clinical guideline of PMRT in BC also strongly substantiates this point. 44 The C-index values of the novel model for predicting prognosis are superior to those previously reported in other models (0.73 and 0.67 of training and test cohorts in Colleoni model, respectively; 0.71 and 0.72 of training and test cohorts in Rouzier model, respectively; 0.78 of training in Keam model with only 2 years of follow-up). 
The possible explanations are that we considered RCB as a predictor and identified rational criteria for the target population in the current study. 45 Recently, Laas et al. reported that Neo-Bioscore had better performance in the overall population compared with RCB, but that RCB showed better performance in subtype-specific groups, especially for luminal BC and TNBC. 16 However, a study conducted by Dana-Farber Cancer Institute showed that RCB is superior to the Neo-Bioscore for stratifying patients into different survival outcomes. 17 Intriguingly, the AUROC values of five models applied to the patients in our center were inferior to those of the model. This is likely because of the distribution difference of heterogeneity in different races, the discrepancy of NAC therapeutic guidelines in different countries, and the predictive improvement of our model after including RCB and LVI. 16, 17 The optimal target population for NAC has been explored and determined in several randomized clinical trials, but verification regarding its prognostic difference is deficient. This may produce unrecognized confounding which might potentially affect the results of these studies, therefore a risk stratification model for discriminating and diminishing heterogeneity is urgently needed. 46 In particular, as is presented in Figure 5 and Supplementary Figures S3, S4 , and S6, available at https://doi.org/10.1016/j.esmoop. 2021.100269, the model can separate BC patients who received NAC into different risk stratums. Especially, median DFS of the high-risk group for BC patients who received NAC is <2 years, which would severely affect the quality of life of patients in this cohort. Fortunately, the model could help clinicians to select potential high-risk cohort of recurrence and conduct clinical trials to decrease the incidence of relapse. 
There are several limitations in the study: (i) The study was limited by the selection bias inherent to its retrospective design, although this shortcoming was reduced by identifying the population with strict criteria. (ii) Several studies have identified tumor-infiltrating lymphocytes on residual disease after NAC as an independent prognostic predictor, 27, 40 but tumor-infiltrating lymphocytes on residual disease were not evaluated in our study. (iii) Fine-needle aspiration for metastatic axillary nodes before NAC and/or evaluation of fibrosis status in resected specimens were not routinely carried out. We were therefore unable to compare the performance of the risk stratification nomogram with some other models (e.g. NRI and NPRI), although the model was compared with most predictive models using the data in our center. (iv) Although our team considered HER2 status and treatment, not all HER2+ BC patients received HER2-targeted therapy. In fact, there are cases in which a prognostic estimate is required for an HER2+ BC patient who has not received trastuzumab, particularly in developing countries. The risk stratification model we developed can also be applied in this population. (v) The newest National Comprehensive Cancer Network guidelines recommend six to eight cycles of NAC in BC patients (https://www.nccn.org/patients/guidelines/cancers.aspx). However, the study population received a median of four cycles of NAC (range, two to eight cycles) according to the Chinese Anti-Cancer Association guidelines for the treatment of BC in 2012-2015. A shorter course of NAC compared to the current guidelines may affect the precise assessment of RCB. Given these drawbacks, the application of the risk stratification nomogram needs to be further verified.

In conclusion, deriving from a large sample of 749 patients, the prognostic model established is the first risk stratification nomogram for predicting individual prognosis. 
With excellent performance and discriminative ability, the model can divide patients treated with NAC into three stratums with obviously different survival both in overall cohort and subtype-specific groups. Therefore, the risk stratification nomogram might be useful for estimating potential high-risk population of BC treated with NAC and identifying comparable candidates in clinical trials. Further verification in patients of different races remains urgently needed.

The authors have declared no conflicts of interest.

Data are available upon reasonable request. The data that support the findings of this study are available on request from the corresponding author, RL.

Use of neoadjuvant chemotherapy for patients with stage I to III breast cancer in the United States
Neoadjuvant therapy as a platform for drug development and approval in breast cancer
Comparison of residual cancer burden, American Joint Committee on Cancer staging and pathologic complete response in breast cancer after neoadjuvant chemotherapy: results from the I-SPY 1 TRIAL (CALGB 150007/150012; ACRIN 6657)
Validation of residual proliferative cancer burden as a predictor of long-term outcome following neoadjuvant chemotherapy in patients with hormone receptor-positive/human epidermal growth receptor 2-negative breast cancer
Nomograms to predict pathologic complete response and metastasis-free survival after preoperative chemotherapy for breast cancer
Combined use of clinical and pathologic staging variables to define outcomes for breast cancer patients treated with neoadjuvant therapy
A risk score to predict disease-free survival in patients not achieving a pathological complete remission after preoperative chemotherapy for breast cancer
A simple system for grading the response of breast cancer to neoadjuvant chemotherapy
Nomogram predicting clinical outcomes in breast cancer patients treated with neoadjuvant chemotherapy
Nottingham clinico-pathological response index (NPRI) after neoadjuvant chemotherapy (Neo-ACT) accurately predicts clinical outcome in locally advanced breast cancer
The neo-bioscore update for staging breast cancer treated with neoadjuvant chemotherapy
Validation of the CPS+EG and Neo-Bioscore staging systems after preoperative systemic therapy for breast cancer in a single center
Validation of a novel staging system for disease-specific survival in patients with breast cancer treated with neoadjuvant chemotherapy
Predictors of recurrence in breast cancer patients with a pathologic complete response after neoadjuvant chemotherapy
Determination of breast cancer prognosis after neoadjuvant chemotherapy: comparison of Residual Cancer Burden (RCB) and Neo-Bioscore
Comparison of breast cancer staging systems after neoadjuvant chemotherapy
A prognostic model based on nodal status and Ki-67 predicts the risk of recurrence and death in breast cancer patients with residual disease after preoperative chemotherapy
Lymphovascular invasion after neoadjuvant chemotherapy is strongly associated with poor prognosis in breast carcinoma
Size of residual lymph node metastasis after neoadjuvant chemotherapy in locally advanced breast cancer patients is prognostic
Ki67, chemotherapy response, and prognosis in breast cancer patients receiving neoadjuvant treatment
Predictors of local-regional recurrence after neoadjuvant chemotherapy and mastectomy without radiation
Prognostic value of pathologic complete response after primary chemotherapy in relation to hormone receptor status and other factors
Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis
Association of pathologic complete response to neoadjuvant therapy in HER2-positive breast cancer with long-term outcomes
Measurement of residual breast cancer burden to predict survival after neoadjuvant chemotherapy
Long-term prognostic risk after neoadjuvant chemotherapy associated with residual cancer burden and breast cancer subtype
Assessment of pathologic response and long-term outcome in locally advanced breast cancers after neoadjuvant chemotherapy: comparison of pathologic classification systems
Prognostic implications of residual disease tumor-infiltrating lymphocytes and residual cancer burden in triple-negative breast cancer patients after neoadjuvant chemotherapy
Prognostic value of residual disease after neoadjuvant therapy in HER2-positive breast cancer evaluated by residual cancer burden, neoadjuvant response index, and Neo-Bioscore
Standardization of pathologic evaluation and reporting of postneoadjuvant specimens in clinical trials of breast cancer: recommendations from an international working group
Prognostic nomogram based on the metastatic lymph node ratio for gastric neuroendocrine tumour: SEER database analysis
Risk stratification based on CLIF consortium acute decompensation score in patients with Child-Pugh B cirrhosis and acute variceal bleeding
Clinical scoring system for the prediction of survival of patients with advanced gastric cancer
Residual disease after neoadjuvant therapy - developing drugs for high-risk early breast cancer
American Joint Committee on Cancer tumor-node-metastasis stage after neoadjuvant chemotherapy and breast cancer outcome
Development and validation of a nomogram for individually predicting pathologic complete remission after preoperative chemotherapy in Chinese breast cancer: a population-based study
Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors
Estimating and comparing time-dependent areas under receiver operating characteristic curves for censored event times with competing risks
Decision curve analysis: a novel method for evaluating prediction models
X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization
Tumour-infiltrating lymphocytes and prognosis in different subtypes of breast cancer: a pooled analysis of 3771 patients treated with neoadjuvant therapy
The role of quantitative estrogen receptor status in predicting tumor response at surgery in breast cancer patients treated with neoadjuvant chemotherapy
American Society for Radiation Oncology, and Society of Surgical Oncology Focused Guideline Update
Development of a prognostic score for recommended TACE candidates with hepatocellular carcinoma: a multicentre observational study
ESMO management and treatment adapted recommendations in the COVID-19 era: breast Cancer

The authors thank Prof. Jie-Lai Xia, Department of Health Statistics, Fourth Military Medical University, Xi'an, China, for his valuable statistical support.