key: cord-0026939-n8xs4v9o authors: Morooka, Hikaru; Tanaka, Akihito; Inaguma, Daijo; Maruyama, Shoichi title: Clustering phosphate and iron-related markers and prognosis in dialysis patients date: 2021-10-14 journal: Clin Kidney J DOI: 10.1093/ckj/sfab207 sha: 985a198b0edd2ff0d200eba3f58fc4a62927b5b3 doc_id: 26939 cord_uid: n8xs4v9o BACKGROUND: Hyperphosphatemia in patients undergoing dialysis is common and is associated with mortality. Recently, the link between phosphate metabolism and iron dynamics has received increasing attention. However, the association between this relationship and prognosis remains largely unexplored. METHODS: We conducted an observational study of patients who initiated dialysis in the 17 centers participating in the Aichi Cohort Study of the Prognosis in Patients Newly Initiated into Dialysis. Data were available on sex, age, use of phosphate binder, drug history, medical history and laboratory data. After excluding patients with missing values of phosphate, hemoglobin, ferritin and transferrin saturation, we used the Gaussian mixture model to divide the cohort into clusters based on phosphate, hemoglobin, logarithmic ferritin and transferrin saturation. We investigated the prognosis of patients in these clusters. The primary outcome was all-cause death. In each cluster, the prognostic impact of phosphate binder was also studied. RESULTS: The study included 1175 patients with chronic kidney disease who initiated dialysis between October 2011 and September 2013. Among them, 785 were men and 390 were women, with a mean ± SD age of 67.9 ± 13.0 years. The patients were divided into three clusters, and mortality was higher in cluster c than in cluster a (P = 0.005). Moreover, the use of phosphate binders was associated with a lower risk of all-cause death in two clusters (a and c) that were characterized by older age and higher prevalence of diabetes mellitus, among other things. 
CONCLUSIONS: We used an unsupervised machine learning method to cluster patients, using phosphate, hemoglobin and iron-related markers. In two of the clusters, the oral use of a phosphate binder might improve prognosis. The number of patients on dialysis is increasing annually, and these patients have a high mortality risk due to various causes [1, 2] . Hyperphosphatemia is a common problem in patients undergoing dialysis. Because renal function decreases in these patients, phosphate excretion is reduced, leading to hyperphosphatemia, which causes vascular calcification and abnormal bone metabolism [3] . As chronic kidney disease (CKD) progresses, serum phosphate, fibroblast growth factor 23 and parathyroid hormone levels increase. In contrast, vitamin D levels decrease [4] . These shifts in serum markers can cause vascular damage, leading to cardiovascular death. Furthermore, hyperphosphatemia can induce apoptosis in the lungs, kidneys and muscles [5] . Patients with hyperphosphatemia have a significantly worse prognosis than patients without hyperphosphatemia in Japan [6, 7] . For patients with hyperphosphatemia, it is important to control phosphate levels. According to the 2017 Kidney Disease: Improving Global Outcomes guideline update recommendation, in patients with CKD G3a-G5d, physicians should decide on phosphate-lowering treatment based on progressively or persistently elevated phosphate [8] . However, the level of evidence for this recommendation was not graded. It is therefore critical to provide more concrete evidence regarding the lowering of phosphate levels. The oral use of phosphate binders has been previously studied. When the phosphate level is higher than 3.7 mg/dL, the phosphate binder significantly improves prognosis [9] . The mean serum phosphate level at 6 months was strongly associated with cardiovascular death [10] . 
Recently, novel iron-based phosphate binders have been developed [11, 12], and the association between iron and phosphate has received increasing attention. Iron has been extensively studied, and its levels reflect chronic inflammation and atherosclerosis [13]. Particularly in patients with CKD, iron plays an important role in inflammation [14]. Serum ferritin levels can predict prognosis in patients on dialysis [15, 16]. However, even though the associations of phosphate and of iron with prognosis in patients on dialysis are well recognized, there has been little research on the association between phosphate and iron dynamics. The Gaussian mixture model is becoming increasingly popular for modeling a wide variety of random phenomena for clustering, classification and density estimation. The model assumes a multivariate Gaussian distribution for each component [17]. Unsupervised machine learning methods, such as the Gaussian mixture model, can be a powerful tool for detecting unknown patterns or phenotypes. Even though such methods have gained more attention recently, the Gaussian mixture model has rarely been used in clinical medicine. Because there is no evidence on which patients should be prescribed phosphate-lowering treatments, we used a machine learning model to reveal potential clusters of phosphate and iron-related markers. We further studied in which clusters phosphate binders could improve prognosis. Only patients who became stable and were discharged from the hospital and those who gave their consent to participate in the study were included. Patients who were not discharged and died in the hospital or who had outlier or missing values of phosphate, hemoglobin, TSAT and ferritin were excluded. Data from the Aichi Cohort Study of the Prognosis in Patients Newly Initiated into Dialysis [18-20] were used in this prospective multicenter study. The participants were patients who commenced dialysis between October 2011 and September 2013 at 17 Japanese institutions.
Patients who initiated dialysis therapy from October 2011 to September 2013 were registered in the database, and their pre-dialysis data from 3 months before dialysis initiation were also registered. The data recorded immediately before dialysis initiation were defined as the baseline. We followed up these patients from dialysis initiation to September 2016. This study was approved by the Ethics Committee of the Institutional Review Board of Nagoya University Hospital (approval number 1335), and all patients provided written informed consent. First, we screened all patients with end-stage kidney disease (ESKD) who were undergoing dialysis. Only patients whose conditions became stable and who were discharged or transferred from the hospital were included in this study. Patients who were not discharged or who died in the hospital were excluded. For the laboratory data collection, we used data immediately before dialysis initiation. Because serum ferritin had some outliers, for example >30 000 ng/mL, we used the logarithm of serum ferritin (logFerritin) for analysis. Patients whose phosphate, hemoglobin and logFerritin values were outliers were excluded. We also excluded patients whose phosphate, hemoglobin, logFerritin and transferrin saturation (TSAT) records were missing (Figure 1). Data regarding patients' background, medical history, comorbidities, medications and laboratory data during the period of dialysis initiation were collected. The users of phosphate binder were defined as patients who were on oral phosphate binders for more than 3 months before dialysis initiation. The phosphate binders were either calcium carbonate or lanthanum carbonate, and most patients used calcium carbonate because the Japanese insurance system allowed its use at the time. The dosage and total duration of use of the oral phosphate binders are unknown. Patients were followed up for 18 months (until the end of March 2015).
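The log transformation and outlier screening described above can be illustrated with a minimal sketch. It assumes a natural logarithm (the text does not state the base) and computes only the Grubbs statistic G = max|x - mean| / SD; the critical value against which G is compared depends on the sample size and significance level and is not computed here. The function names and the ferritin values are hypothetical and are not study data.

```python
import math
import statistics


def log_ferritin(ferritin_ng_ml):
    """Log-transform serum ferritin values (natural log; the base is an assumption)."""
    return [math.log(v) for v in ferritin_ng_ml]


def grubbs_statistic(values):
    """Return (G, index): the Grubbs statistic and the position of the most extreme value.

    G = max |x_i - mean| / sd, using the sample standard deviation.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / sd, idx


# Hypothetical ferritin values in ng/mL (illustrative only, not study data).
ferritin = [85.0, 120.0, 240.0, 310.0, 95.0, 30000.0]

g_raw, i_raw = grubbs_statistic(ferritin)
g_log, i_log = grubbs_statistic(log_ferritin(ferritin))

print(f"raw scale: G = {g_raw:.3f}, most extreme index = {i_raw}")
print(f"log scale: G = {g_log:.3f}, most extreme index = {i_log}")
```

In this toy example the same value is the most extreme on both scales, but its Grubbs statistic is smaller after the log transformation, which is the usual motivation for transforming a right-skewed marker before screening for outliers.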
The goal of our clustering approach was to divide patients into distinct subgroups. To achieve this goal, we used a Gaussian mixture model. We chose phosphate, hemoglobin, logFerritin and TSAT as clustering inputs for Gaussian mixture modeling, and we used the Smirnov-Grubbs test to exclude outlier values for phosphate, hemoglobin and logFerritin. First, candidate numbers of clusters from 1 to 10 were assessed using the Bayesian information criterion (BIC). Then, after choosing the most appropriate model and cluster number, we fitted the Gaussian mixture model (model selection: VVV, which leaves the covariance unconstrained across mixture components). We compared these four markers in each cluster with the Kruskal-Wallis test. Patients were divided into three clusters according to the clustering model. The primary endpoint was all-cause mortality. The causes of death were recorded to the maximum extent possible. The occurrence of death was investigated via survey slips sent to the dialysis facilities at the end of March 2015, until we finally obtained the replies. We compared the outcomes and univariate and multivariate Cox hazard ratios (HRs) among the three groups. The patients were divided into two groups in each cluster: those who used and those who did not use an oral phosphate binder. The primary endpoint was all-cause mortality. We compared the outcomes and univariate and multivariate Cox HRs between the two groups in each cluster. Baseline characteristics were presented descriptively and tested with analysis of variance or the χ² test. Survival was represented graphically using the Kaplan-Meier method and analyzed with univariate and multivariate Cox regression analyses. For the Kaplan-Meier analyses, pairwise comparisons between clusters were adjusted with the Bonferroni correction. After the Kruskal-Wallis tests, pairwise differences between clusters were tested with the Dunn test, and the Bonferroni correction was used to adjust the P value. Statistical significance was set at P < 0.05.
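The BIC-based choice of cluster number described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses the BIC convention of the mclust package (2 × log-likelihood minus the number of parameters × log n, so larger is better) and counts the free parameters of an unconstrained-covariance (VVV) mixture with d = 4 markers. The function names and the log-likelihood values are hypothetical; in practice the maximized log-likelihoods come from fitting the mixture model for each candidate number of clusters.

```python
import math


def vvv_parameter_count(k, d):
    """Free parameters of a k-component Gaussian mixture with unconstrained covariances."""
    mixing = k - 1                        # mixing proportions sum to 1
    means = k * d                         # one mean vector per component
    covariances = k * d * (d + 1) // 2    # one symmetric d x d matrix per component
    return mixing + means + covariances


def bic_mclust(log_likelihood, n_params, n_obs):
    """BIC in the mclust convention: larger values indicate a better model."""
    return 2.0 * log_likelihood - n_params * math.log(n_obs)


def select_n_clusters(log_likelihoods, d, n_obs):
    """Pick the cluster number with the largest BIC.

    log_likelihoods maps each candidate k to its maximized log-likelihood.
    """
    scores = {
        k: bic_mclust(ll, vvv_parameter_count(k, d), n_obs)
        for k, ll in log_likelihoods.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods (illustrative only, not values from the study).
hypothetical_loglik = {1: -9000.0, 2: -8800.0, 3: -8650.0, 4: -8620.0}

# d = 4 markers and n = 1175 patients, as reported in the text.
best_k, bic_by_k = select_n_clusters(hypothetical_loglik, d=4, n_obs=1175)
print(best_k, {k: round(v, 1) for k, v in bic_by_k.items()})
```

The penalty term grows quickly for the VVV model (44 free parameters for three clusters of four markers), so an extra cluster is retained only when it improves the log-likelihood enough to offset the added parameters.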
R (version 4.0.0, R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org/) was used for all statistical analyses. The R package 'mclust' was used for clustering [17]. The initial population included 1524 participants. Two patients who were untraceable were excluded from the study. After we also excluded patients with outlier values (n = 7) and missing values (n = 340), 1175 patients remained in our cohort (Figure 1). Among these, 785 were men and 390 were women, with a mean age of 67.9 ± 13.0 years and a mean follow-up duration of 796.7 ± 285.1 days. Figure 2 shows the BIC plot; the BIC was highest with three clusters. Therefore, we chose the number of clusters to be 3. The clusters were named a, b and c. Figure 3 shows the clustering of the four markers. The baseline characteristics of these clusters are presented in Table 1. As the patients were clustered by phosphate, logFerritin, hemoglobin and TSAT, there were significant differences in these four markers, as shown in the table. There was no significant difference among the clusters in terms of gender, the use of iron supplements and intact parathyroid hormone (iPTH). There were significant differences in age, causes of CKD, some past medical histories (diabetes mellitus, coronary artery disease, atrial fibrillation and admission due to heart failure), cardiothoracic ratio, ejection fraction, the use of medications [phosphate binder, calcium channel blocker, loop diuretic, angiotensin-converting enzyme inhibitor or angiotensin receptor blocker, vitamin D receptor agonist and erythropoiesis-stimulating agent (ESA)], white blood cells, platelets, albumin, blood urea nitrogen, creatinine, sodium, potassium, adjusted calcium, iron, uric acid, beta-2 microglobulin, C-reactive protein, pH and bicarbonate. Figure 4 shows the Kruskal-Wallis test plots for each marker in each cluster.
Among the clusters, there were significant differences in hemoglobin, logFerritin and TSAT (Figure 4B-D, P < 0.05), and cluster b had the highest levels of logFerritin and TSAT and the lowest hemoglobin. There were significant differences in phosphate levels between clusters a and b and between clusters b and c. However, we could not observe any significant difference between clusters a and c (Figure 4A). Supplementary data, Figures S1-S4 show histograms of these four markers in each cluster. Among the clusters, there were significant differences in cardiovascular disease (CVD)-related deaths (Table 1). Cluster c had the highest CVD-related and all-cause mortality among the three clusters. Figure 5A shows the Kaplan-Meier plot for all-cause death among the three clusters. There was a significant difference among clusters (P = 0.041). There was a significant difference between clusters a and c (P = 0.042), but there was no significant difference between clusters a and b or between clusters b and c (P = 0.485 and P = 1.000, respectively). Table 2 shows the univariate and multivariate Cox hazard models for all-cause mortality among the three clusters. Cluster c had a significantly worse prognosis than cluster a (adjusted HR = 1.52, P = 0.005), but there was no significant difference between clusters a and b (adjusted HR = 1.51, P = 0.084). For CVD-related death, Supplementary data, Figure S5 shows that there was a significant difference among the clusters (P = 0.005). Cluster c was also significantly associated with a worse prognosis (Supplementary data, Table S1; adjusted HR = 2.22, P = 0.002). Figure 5B-D shows that the oral use of phosphate binder was associated with better prognosis in all clusters except cluster b (cluster a: P < 0.001; cluster b: P = 0.34; cluster c: P < 0.001).
Table 3 shows that in clusters a and c, phosphate binder was associated with better prognosis after adjustment (cluster a: adjusted HR = 0.32, P < 0.001; cluster c: adjusted HR = 0.52, P = 0.009). In cluster b, after adjustment, we did not observe an association between prognosis and use of oral phosphate binder (adjusted HR = 0.79, P = 0.689). This study aimed to use a machine learning model to reveal potential clusters of phosphate and iron-related markers of prognosis among patients undergoing dialysis and to detect the kind of patients who will benefit from oral phosphate binder therapy. Our current results suggest that unsupervised clustering by phosphate and iron-related markers can predict the prognosis of patients who are undergoing dialysis. In particular, cluster a (patients with the best-controlled level of phosphate and stable iron dynamics) showed a good prognosis. Moreover, the prognostic effect of oral use of phosphate binder differed between clusters. This suggests that it is important to consider multiple markers when prescribing oral phosphate binders to patients with ESKD. Appropriate phosphate control has been well discussed previously [3-6]. In these previous studies, the authors mainly discussed how to control the phosphate level by considering phosphate levels with or without calcium levels. In a meta-analysis, patients who had CKD stages 3-5d and were using sevelamer had lower all-cause mortality than those who were using calcium-based binders to treat hyperphosphatemia [21]. In patients with CKD, a retrospective study found that the use of phosphate binders was associated with a lower risk of mortality [22]. On the other hand, iron-deficiency anemia is a major problem in ESKD [14]. As is often assumed, iron-related markers are associated with mortality [15, 16]. However, these studies are biased due to the study protocols.
The randomized controlled trials included only patients who matched rigid protocols, and may not always reflect real clinical practice. If a study includes multiple factors such as phosphate, iron-related markers and anemia at the same time, it is easy to assume that the cohort would be heavily biased. In clinical practice, physicians usually assess prognosis with multiple parameters. Therefore, our analysis is significant for assessing prognosis, especially because our clustering results were similar to the physicians' experience, even though the Gaussian mixture model needs further studies to ensure superiority over traditional statistical methods. When interpreting our results, it is notable that the clusters are mainly grouped by TSAT levels. Cluster a included patients whose conditions were rather stable compared with the other two clusters. Moreover, as Table 1 shows, past medical histories such as diabetes and chronic heart failure were less common in cluster a than in the other two clusters. On the other hand, more patients in cluster a were on medications, such as phosphate binders, ESAs and angiotensin-converting enzyme inhibitors or angiotensin receptor blockers, than in clusters b and c. Considering these findings, patients in cluster a appear to be well controlled and treated for CKD. As shown in Figure 3, cluster b included patients whose TSAT and phosphate levels were the highest among the three clusters. In cluster b, we observed less diabetes and fewer admissions due to heart failure. Normally, it could be assumed that cluster b patients should be in the best condition; however, they were on fewer medications than the other two clusters. We considered that cluster b was controlled poorly. Patients in cluster c seemed to be clinically unstable due to iron-deficiency anemia.
Among the three clusters, cluster c had the worst past medical histories; however, against these backgrounds, the patients were treated with medications such as beta blockers and statins. Cluster c contains patients whose background is worse than that of patients in the other clusters. However, in cluster c more patients were on ESAs, and because of this, cluster c has a similar prognosis to cluster b, in which patients' condition is relatively good. On the other hand, cluster a had a better prognosis than cluster c. In both clusters, the use of medications such as ESAs and iron supplements is almost the same. However, cluster c has lower TSAT, hemoglobin and ferritin than cluster a, meaning that the optimal levels of these markers should be higher than those of cluster c. These findings indicate the importance of controlling anemia. Regarding iron-related markers, we show that the combination of phosphate, anemia and iron-related markers can predict the prognosis of patients initiating dialysis. Considering these characteristics and Figure 3, cluster b has patients with sparse laboratory results, meaning that patients in cluster b can be considered patients who could not fit clusters a and c, probably a heterogeneous group showing unstable iron dynamics. Interestingly, among the three clusters, there was no significant difference in iPTH. The patients' data in our cohort were recorded before dialysis initiation, which implies that the patients had ESKD. Therefore, because it is rather natural for ESKD patients to have hyperphosphatemia, hypocalcemia and elevated iPTH levels, we considered that it was better to add other markers such as iron-related markers to our clustering method. Because our clustering method is an unsupervised machine learning method that determines clusters algorithmically, we can only assume the cluster characteristics based on the results. Prognosis worsened from cluster a to cluster c.
Moreover, as Figure 3 shows, our clustering grouped patients rather clearly, probably depending on TSAT, and we cannot deny other alternative approaches to clustering, including not only clustering methods but also marker selection. Even though there could be alternative clustering methods available, our current clustering method could differentiate patients into phenotypes that match clinical practice. Our results suggest that there are differences among clusters regarding prognosis based on the use of phosphate binders. We observed that the use of phosphate binders could be associated with a better prognosis of patients in clusters a and c. Considering that patients in cluster c had iron deficiency, it is possible that with decent control of both phosphate and iron, their prognosis could improve, suggesting that iron-conjugated phosphate binders may improve prognosis. However, in cluster b, we did not observe an association between prognosis and the use of phosphate binders. This suggests that when seeing patients with high TSAT levels, it may be better to control other problems first. Furthermore, as the patients' characteristics show, we could observe more CVD and diabetes in clusters a and c than in cluster b. This suggests the possibility of some patients considering the use of phosphate binders. In short, our study suggests that the combination of phosphate levels and iron-related markers may be a reliable tool for assessing both the prognosis of ESKD patients and the efficacy of phosphate binders. Our study is remarkable because it is the first study to reveal the potential relationship between phosphate, hemoglobin and iron-related markers using machine learning methods for predicting prognosis. Although unsupervised clustering analyses are becoming increasingly popular in the study of human diseases [23] [24] [25] [26] [27] , such analyses are rarely seen in clinical nephrology. 
Unlike classical approaches, machine learning methods can discover more homogeneous groups within heterogeneous sets of data [28]. In our study, we detected possible associations between phosphate, hemoglobin and iron-related markers. Moreover, we found a possibility for improvement in patients on dialysis with hyperphosphatemia. It is important to further study more sophisticated criteria for starting phosphate-lowering treatments. Our study has several strengths. First, it involved a well-defined population and an extremely high follow-up rate. Second, we showed a potential association between phosphate and iron-related markers. Third, to our knowledge, this is the first study to use an unsupervised machine learning method to cluster patients who are initiating dialysis therapy. However, the study also has some limitations. First, as it was an observational study, there was an inevitable selection bias regarding the administration of phosphate binder. Second, we could not validate our results with other datasets, limiting our conclusions for further use. Third, our survival analyses could not entirely exclude selection bias because our clustering method does not divide patients without creating background bias, meaning each cluster has different patient characteristics, such as age, gender and past medical history. However, our method is not meant to match patients' backgrounds but rather to subtype patients. Therefore, our aim is different. Because our study showed an important association between phosphate, hemoglobin and iron-related markers for predicting prognosis, it is important to study this association further. Moreover, our study showed that phosphate binders could possibly improve the prognosis of some patients. Therefore, it is necessary to study to whom oral phosphate binders should be prescribed by observing not only phosphate values but also other markers.
Currently, iron-based phosphate binders are becoming popular, and it is necessary to study how phosphate and iron-related markers are associated with prognosis when using such medications. We used an unsupervised machine learning method to cluster patients using phosphate, hemoglobin and iron-related markers, which could predict prognosis. In two of the clusters, the oral use of a phosphate binder could be associated with a better prognosis. Isao Aoyama (Japan Community Healthcare Organization Chukyo Hospital), Hiroshi Ogawa (Shinseikai Daiichi Hospital), Hiroko Kushimoto (Nishichita General Hospital
[1] Current state of dialysis treatment and vascular access management in Japan
[2] An overview of regular dialysis treatment in Japan (as of
[3] Hyperphosphatemia of chronic kidney disease
[4] Role of Klotho in aging, phosphate metabolism, and CKD
[5] Cinacalcet for hemodialyzed patients with or without a high PTH level to control serum calcium and phosphorus: ECO (Evaluation of Cinacalcet HCl Outcome) study
[6] Effects of serum calcium, phosphorous, and intact parathyroid hormone levels on survival in chronic hemodialysis patients in Japan
[7] Serum phosphate and calcium should be primarily and consistently controlled in prevalent hemodialysis patients
[8] Executive summary of the 2017 KDIGO Chronic Kidney Disease-Mineral and Bone Disorder (CKD-MBD) Guideline Update: what's changed and why it matters
[9] Phosphorus binders and survival on hemodialysis
[10] Impact of longer term phosphorus control on cardiovascular mortality in hemodialysis patients using an area under the curve approach: results from the DOPPS
[11] A phase III study of the efficacy and safety of a novel iron-based phosphate binder in dialysis patients
[12] Ironing out the phosphorus problem
[13] New insights into the role of iron in inflammation and atherosclerosis
[14] Impact of inflammation on ferritin, hepcidin and the management of iron deficiency anemia in chronic kidney disease
[15] Association between serum ferritin and mortality: findings from the USA, Japan and European Dialysis Outcomes and Practice Patterns Study
[16] Serum ferritin predicts prognosis in hemodialysis patients: the Nishinomiya study
[17] mclust 5: clustering, classification and density estimation using Gaussian finite mixture models
[18] Presence of atrial fibrillation at the time of dialysis initiation is associated with mortality and cardiovascular events
[19] Aichi Cohort Study of the Prognosis in Patients Newly Initiated into Dialysis (AICOPP): baseline characteristics and trends observed in diabetic nephropathy
[20] Peripheral artery disease at the time of dialysis initiation and mortality: a prospective observational multicenter study
[21] Sevelamer versus calcium-based binders for treatment of hyperphosphatemia in CKD: a meta-analysis of randomized controlled trials
[22] Use of phosphorus binders among non-dialysis chronic kidney disease patients and mortality outcomes
[23] Derivation, validation, and potential treatment implications of novel clinical phenotypes for sepsis
[24] Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data
[25] Cognitive phenotypes 1 month after ICU discharge in mechanically ventilated patients: a prospective observational cohort study
[26] Deploying unsupervised clustering analysis to derive clinical phenotypes and risk factors associated with mortality risk in 2022 critically ill patients with COVID-19 in Spain
[27] ICU staffing feature phenotypes and their relationship with patients' outcomes: an unsupervised machine learning analysis. Intensive
[28] Thresher: determining the number of clusters while removing outliers
We acknowledge the support of the following members of the Aichi Cohort Study of the Prognosis in Patients Newly Initiated into Dialysis who participated in this study: Hirofumi
Supplementary data are available at ckj online.
H.M. performed the data analysis and interpretation, and wrote the first draft of the article and subsequent revisions. D.I. conceived and designed the study and constructed the dataset. A.T.
accessed the dataset, contributed to data analysis and interpretation, and provided feedback on the article. S.M. contributed to the study design, provided feedback on the article and approved the submitted version. The corresponding author attests that all listed authors meet authorship criteria and that no others who meet the criteria have been omitted. The authors did not receive a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors. All authors have no conflict of interest. No additional data are available.