key: cord-0069405-koajsb0y
authors: Ko, Sunho; Pareek, Ayoosh; Jo, Changwung; Han, Hyuk-Soo; Lee, Myung Chul; Krych, Aaron J.; Ro, Du Hyun
title: Automated Risk Stratification of Hip Osteoarthritis Development in Patients With Femoroacetabular Impingement Using an Unsupervised Clustering Algorithm: A Study From the Rochester Epidemiology Project
date: 2021-11-02
journal: Orthop J Sports Med
DOI: 10.1177/23259671211050613
sha: d651d31f5abc31bb74f930c30e61e81fc227f5f2
doc_id: 69405
cord_uid: koajsb0y

BACKGROUND: Studies evaluating the natural history of femoroacetabular impingement (FAI) are limited.

PURPOSE: To stratify the risk of progression to osteoarthritis (OA) in patients with FAI using an unsupervised machine-learning algorithm, compare the characteristics of each subgroup, and validate the reproducibility of staging.

STUDY DESIGN: Cohort study (prognosis); Level of evidence, 2.

METHODS: A geographic database from the Rochester Epidemiology Project was used to identify patients with hip pain between 2000 and 2016. Medical charts were reviewed to obtain characteristic information, physical examination findings, and imaging details. The patient data were randomly split into 2 mutually exclusive sets: train set (70%) for model development and test set (30%) for validation. The data were transformed via Uniform Manifold Approximation and Projection and were clustered using Hierarchical Density-based Spatial Clustering of Applications with Noise.

RESULTS: The study included 1071 patients with a mean follow-up period of 24.7 ± 12.5 years.
The patients were clustered into 5 subgroups based on train set results: patients in cluster 1 were in their early 20s (20.9 ± 9.6 years), female dominant (84%), with low body mass index (<19 kg/m²); patients in cluster 2 were in their early 20s (22.9 ± 6.7 years), female dominant (95%), and pincer-type FAI (100%) dominant; patients in cluster 3 were in their mid 20s (26.4 ± 9.7 years) and were mixed-type FAI dominant (92%); patients in cluster 4 were in their early 30s (32.7 ± 7.8 years), with high body mass index (≥29 kg/m²) and diabetes (17%); and patients in cluster 5 were in their early 30s (30.0 ± 9.1 years), with a higher percentage of males (43%) compared with the other clusters and with limited internal rotation (14%). Mean survival for clusters 1 to 5 was 17.9 ± 0.6, 18.7 ± 0.3, 17.1 ± 0.4, 15.0 ± 0.5, and 15.6 ± 0.5 years, respectively, in the train set. The survival difference was significant between clusters 1 and 4 (P = .02), 2 and 4 (P < .005), 2 and 5 (P = .01), and 3 and 4 (P < .005) in the train set and between clusters 2 and 5 (P = .03) and 3 and 4 (P = .01) in the test set. Cluster characteristics and prognosis were well reproduced in the test set.

CONCLUSION: Using the clustering algorithm, it was possible to determine the prognosis for OA progression in patients with FAI in the presence of conflicting risk factors acting in combination.

Patients with hip FAI are known to be at an increased risk of hip osteoarthritis (OA), 2,3,6 yet little is known about the risk factors for hip OA. Recently, Melugin et al conducted a large cohort study of 1104 patients in the Rochester Epidemiology Project (REP) database, analyzing risk factors for hip OA in patients with FAI without prior surgery. 16 The REP is a unique population health resource and a collaborative initiative established in 1966 and originating in Olmsted County, Minnesota. 15 The candidate risk factors included characteristic information, radiologic measurements, and physical examination findings.
The rate of OA was 13.5%, and total hip arthroplasty (THA) was performed in 4% of patients. Male sex, body mass index (BMI) >29 kg/m², and increased age were risk factors for OA.

Independent risk factors for disease are analyzed using conventional statistics in many studies. However, patients have multiple risk factors, and these risk factors are not independent of one another. Therefore, prediction models based on machine learning have been developed in orthopaedics to calculate the overall effect of these risk factors, classifying patients into high- and low-risk groups. 11, 13 Certain risk factors are not prominent across the entire patient group but can have a great effect on outcomes in specific subgroups. Such risk factors are difficult to identify with existing conventional models. If patients are clustered based on the topological relationship between variables, the risk stratification and the dominant risk factors in each cluster can be studied.

The purpose of this study was to stratify the risk of progression to OA in patients with hip FAI using an unsupervised machine-learning algorithm, compare the characteristics of each subgroup, and validate the reproducibility of staging performed via clustering analysis using a geographic population with long-term follow-up. We hypothesized that (1) risk stratification for progression to OA in patients with hip FAI can be achieved with unsupervised machine learning; (2) there are several subgroups of patients with hip FAI with respect to progression to OA, each with distinct risk factors; and (3) the staging produced by the clustering algorithm is reproducible.

This study was approved by an institutional review board. Patients from the REP who presented to a physician with an International Classification of Diseases, 9th Revision (ICD-9) or 10th Revision (ICD-10), diagnostic code of hip pain, hip impingement, or hip joint disorders between January 2000 and December 2016 were reviewed.
The REP is a medical record linkage system representing a population-based data resource combining the medical records of our institution and other community providers in the city of Rochester and in Olmsted County, Minnesota. Patients between the ages of 14 and 50 years were included. For all patients, time from initial hip pain to follow-up was calculated. The patient data were monitored until December 2019 (minimum follow-up of 3 years). Patients with a history of other hip disorders (avascular necrosis, neuromuscular disorder, trochanteric bursitis, hip fracture, pelvic fracture, and hip dislocation) or a history of hip arthroplasty or hip preservation procedures were excluded.

All patients had an anteroposterior view and at least 1 lateral view (cross-table, frog-leg, or 45° Dunn) hip radiograph taken during their initial assessment. All radiographs were reviewed by attending or senior resident-level orthopaedic surgeons. The classification of FAI (cam, pincer, mixed) was based on radiographic criteria proposed by Clohisy et al. 5 The radiographic reviews were evaluated by 2 orthopaedic surgeons to confirm the consistency of radiographic measurement. Clinical charts were reviewed to determine the physical examination findings. Patients with symptoms, clinical signs, and radiographic findings consistent with hip FAI as defined by the Warwick Agreement 7 were included. Patient characteristic factors were identified during chart review. Symptomatic hip OA was defined as symptoms requiring treatment and Tönnis grade 1 or higher on hip radiographs. Symptoms included pain, hip clicking, catching, locking, stiffness, restricted range of motion, or giving way. 7 The data collection and preprocessing steps were performed as described by Melugin et al. 16

The statistical analysis was performed using R 4.0 (R Core Team, http://www.R-project.org/) and Python 3.7 (Python Software Foundation, https://www.python.org).
A descriptive statistical analysis of characteristic, radiographic, and physical examination features was performed. Chi-square and t tests were used to compare the OA and non-OA groups. One-way analysis of variance was used for multiple group comparison. Survival curves were plotted using the Kaplan-Meier method, and the log-rank test was used for comparison. Missing data were replaced by the median value for numerical features and by a constant value for categorical features (0 for a binary categorical variable; otherwise, the category closest to the median value of the feature). Initial feature selection retained only the features that differed significantly between the OA and non-OA groups, so that patients would be clustered on risk factors for OA progression. The patient data were randomly split into 2 mutually exclusive sets: train set (70%) for model development and test set (30%) for validation. The features of the train set and test set were compared to assess the adequacy of the split.

In unsupervised machine learning, the training data are not labelled, and thus the system learns without any guidance. Clustering can first be used to group the patients without any labels (train set); the fitted clustering can then be used to label the test set for validation or to find out which cluster a new patient belongs to. In this study, 2 unsupervised machine-learning steps, dimension reduction and clustering, were used to stage the risk of progression to OA in patients with hip FAI. Uniform manifold approximation and projection for dimension reduction (UMAP) is a powerful, nonlinear dimension reduction technique. 14 UMAP projects the data into a low-dimensional space while conserving its original topological structure. Hierarchical density-based spatial clustering of applications with noise (HDBSCAN) is an enhanced version of density-based spatial clustering of applications with noise (DBSCAN), which reflects the local density and the hierarchical structure of the data.
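To make the density-based idea concrete, the following is a minimal pure-Python sketch of plain DBSCAN, the simpler algorithm that HDBSCAN extends. It is not the HDBSCAN implementation used in this study, and it does not perform UMAP; the toy 2-dimensional points stand in for an already projected embedding. The parameter values (`EPS`, `MIN_PTS`), the toy data, and the helper `assign`, which labels an unseen point by its nearest core point, are all illustrative assumptions.

```python
import math
from collections import deque

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN on 2-D points. Returns (labels, core_indices)."""
    labels = [None] * len(points)
    core = []
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        core.append(i)
        queue = deque(neighbours)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                core.append(j)
                queue.extend(nbrs)
    return labels, core


def assign(point, points, labels, core, eps):
    """Hypothetical helper: label a new point by its nearest core point
    within eps, or NOISE if no core point is that close."""
    best, best_d = NOISE, eps
    for i in core:
        d = math.hypot(point[0] - points[i][0], point[1] - points[i][1])
        if d <= best_d:
            best, best_d = labels[i], d
    return best


# Toy "train" embedding: two dense groups and one isolated point (assumed data).
train_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
                (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
                (9.0, 0.0)]
EPS, MIN_PTS = 0.3, 3  # assumed values, not taken from the study

labels, core = dbscan(train_points, EPS, MIN_PTS)
print(labels)

# Label unseen "test" points with the clusters learned on the train points.
for new_point in [(0.05, 0.2), (5.2, 5.2), (2.5, 2.5)]:
    print(new_point, assign(new_point, train_points, labels, core, EPS))
```

Points that fall in no dense region receive the noise label, which corresponds to the possibility, discussed in the limitations, that a patient may not belong to any cluster.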
The data were first projected into 2-dimensional space via UMAP and then clustered by HDBSCAN. The clustering-based model was compared with a binary-classification machine-learning model. The binary-classification model was developed according to the gradient boosting machine algorithm; it classifies patients into high-risk and low-risk groups. For comparison with the clustering technique, the test set of the previous study was clustered and visualized with the algorithm developed in this study.

Overall, 1071 of 1893 patients met the inclusion criteria. Patients who had undergone hip arthroscopy or hip preservation procedures were excluded (n = 208). The flowchart of the process is shown in Figure 1. Mean follow-up time was 24.7 ± 12.5 years. Mean age at the onset of pain was 28.5 ± 9.3 years, and males constituted 29.6% of the cohort. Progression to OA was detected in 149 (13.9%) patients over an average follow-up of 40.1 months. The average follow-up of non-OA patients was 95.4 months.

Of the 37 variables, 14 were initially selected via univariate analysis. The full list of 37 variables can be found in Appendix Table A1. The characteristics of the OA and non-OA groups are listed in Table 1. Follow-up duration differed significantly between the OA (40.1 ± 52.9 months) and non-OA (95.4 ± 58.9 months) groups. The rate of labral tear on magnetic resonance imaging (MRI) also differed significantly between the OA (72%) and non-OA (52%) groups. Since MRI was performed electively, 740 (69%) patients did not have MRI data. Physical examination data also had a large proportion of missing values, ranging from 26% to 33%. There were no significant differences between the train and test sets (Appendix Table A2). Additionally, no difference in survival outcome was detected between the 2 sets (Appendix Figure A1), indicating an excellent split between the train and test datasets.

Figure 2 shows the clusters of the train and test datasets obtained via UMAP and HDBSCAN.
The train set was clustered into 5 subgroups (Figure 2A). Figure 2B annotates the patients who progressed to OA in the train set. Figure 2C shows the projection of test patients by the trained model. Figure 2D annotates the patients who progressed to OA in the test set. The distribution of patients and progression to OA were well reproduced in the test set.

The characteristics and progression to OA of the clusters in the train set are listed in Table 2 and shown in Figure 3. Cluster 1 had a mean age of 20.9 years, was 16% male, all patients had a BMI of <19 kg/m², and the mean Tönnis grade was 0.63. Cluster 2 had a mean age of 22.9 years, was 5% male, all patients had a BMI of 19 to 24 kg/m², and the mean Tönnis grade was 0.74. Cluster 3 had a mean age of 26.4 years, was 29% male, all patients had a BMI of 19 to 24 kg/m², and the mean Tönnis grade was 0.75. Cluster 4 had a mean age of 32.7 years, was 24% male, patients had a BMI of 29 to 34 kg/m² in 58% and ≥34 kg/m² in 42%, and the mean Tönnis grade was 0.10. Cluster 5 had a mean age of 30.0 years, was 43% male, all patients had a BMI of 24 to 29 kg/m², and the mean Tönnis grade was 0.81. The corresponding characteristics and progression to OA of the test-set clusters are shown in Table 3 and Figure 4.

Figure 5 shows the difference between a binary-classification model and a clustering-based model. The patient groups based on clustering are clearly distinguished from one another in terms of the risk or rate of OA progression and represent patient groups that did not appear in the existing binary-classification machine-learning model.

The progression to OA in each group is presented in Figure 6. The percentage of progression to OA (survival) of each cluster is listed in Table 4. In the train set, the mean survival for clusters 1 to 5 was 17.9 ± 0.6, 18.7 ± 0.3, 17.1 ± 0.4, 15.0 ± 0.5, and 15.6 ± 0.5 years, respectively.
According to the log-rank test, there were significant differences in survival between train-set clusters 1 and 4 (P = .02), 2 and 4 (P < .005), 2 and 5 (P = .01), and 3 and 4 (P < .005), and likewise between test-set clusters 2 and 5 (P = .03) and 3 and 4 (P = .01). The long-term prognosis of each group was clearly distinguished in both the train and test sets in a similar fashion.

Data are reported as mean ± SD or n (%). All variables were statistically significantly different between the study groups (P < .05). BMI, body mass index; ER, external rotation; IR, internal rotation; MRI, magnetic resonance imaging; OA, osteoarthritis.

Using the dimension reduction and clustering algorithms, patients with hip FAI were separated into 5 clusters. Characteristics and survival of each cluster were evaluated, and each cluster had a different risk for OA progression.

Cluster Characteristics

Table 2 clearly shows the differences in BMI and type of impingement between clusters in the train set. The best prognosis among the 5 clusters was seen in the cluster 2 patients, characterized by age in the early 20s (22.9 ± 6.7 years), female dominance (95%), and pincer-type FAI dominance (100%), with a mean survival of 18.7 ± 0.3 years. On the other hand, the worst prognosis was in the cluster 4 patients, characterized by age in the early 30s (32.7 ± 7.8 years), high BMI (≥29 kg/m²), and diabetes (17%), with a mean survival of 15.0 ± 0.5 years. Patients in cluster 3 were in their mid-20s (26.4 ± 9.7 years) and were mixed-type FAI dominant (92%), with a mean survival of 17.1 ± 0.4 years. Patients in cluster 1 were characterized by age in the early 20s (20.9 ± 9.6 years), female dominance (84%), and low BMI (<19 kg/m²), with a mean survival of 17.9 ± 0.6 years. Cluster 5 patients were in their early 30s (30.0 ± 9.1 years) and had a higher percentage of males (43%) than the other clusters, with limited internal rotation (14%) and a mean survival of 15.6 ± 0.5 years.
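The per-cluster survival figures and pairwise comparisons above rest on the Kaplan-Meier method and the log-rank test named in the methods. The following is a minimal pure-Python sketch of both for 2 groups, offered as an illustration rather than as the code used in this study. The toy follow-up times and event flags are invented, and computing mean survival as the area under the Kaplan-Meier curve up to a fixed horizon is an assumption, since the text does not state how the reported means were derived.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier curve as [(t, S(t))] at each distinct event time.
    events: 1 = progressed to OA, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = c = 0  # events and total subjects leaving the risk set at t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            c += 1
            i += 1
        if d:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= c
    return curve


def restricted_mean(curve, horizon):
    """Area under the KM step curve from 0 to horizon."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square, p-value), 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p


# Toy data in years (invented for illustration only).
times_a, events_a = [2, 4, 6, 8, 10], [1, 0, 1, 0, 0]
curve_a = kaplan_meier(times_a, events_a)
mean_a = restricted_mean(curve_a, horizon=10)
print(curve_a, mean_a)

early = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
late = ([6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
chi2, p = logrank(early[0], early[1], late[0], late[1])
print(chi2, p)
```

With 5 clusters, a pairwise comparison such as the one reported above would call `logrank` once per cluster pair; whether any multiple-comparison correction was applied is not stated in the text.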
The relationship between the characteristics of each cluster and its prognosis can be explained by the results of former studies. BMI >29 kg/m², increased age, and male sex have all been identified as risk factors for OA progression. 10, 16, 18 The difference in the type of impingement between clusters may originate in the sex difference. Pincer type is predominant in females, while the cam-type lesion is dominant in young male athletes. 1, 4, 8 This can be seen in cluster 2, which was female dominant (95%) and pincer-type dominant (100%), while cluster 3 had a higher percentage of males (29%) than cluster 2 and showed cam-type (8%) and mixed-type (92%) impingement.

There was no significant difference between clusters in MRI labral tear or physical examination data, despite their clinical importance. MRI labral tear and physical examination data are binary categorical features with a large proportion of missing values. For binary categorical features, missing values can be imputed only by a constant value, usually (and in this case) zero, so that only a small portion of all patients have positive values. Because the clustering algorithm divides subgroups based on features that group the whole patient population, the influence of features with only a small percentage of positive values inevitably decreases in clustering.

Classification models predicting high-risk groups and survival regression models such as random survival forest have mainly been used in prognostic models based on supervised machine-learning algorithms. 11-13 The classification model facilitates clear interpretation of the result. Risk status, sensitivity, and specificity data provide clear insight into patient status. However, the follow-up time is not considered when modeling long-term follow-up data. In the case of a survival regression model, a quantified survival curve can be obtained.
However, the c-index, a widely used evaluation metric, determines the order of events (progression to OA), but not the exact survival time. However, the

There are limitations in the patient population. Although the mean age of the group was young, patients up to 55 years old were included in the study. The risk of OA is expected to increase with age. The patients were from a single location, and the dominant race was White. Additional multiregional studies are needed to determine whether this model can be generalized to patients in other regions.

There may be disagreement on the definition of symptomatic hip OA. In this study, symptomatic hip OA was defined as symptoms requiring treatment and Tönnis grade 1 or higher on hip radiographs. Some may argue that including Tönnis grade 1 is too stringent in determining significant OA. The indication for THA was not standardized in all patients, so this group was excluded and not analyzed separately.

There are limitations in the modeling algorithm. Because the clustering method does not provide model interpretability (black box), the modeling process must be inferred from the characteristics of each cluster. Therefore, another statistical analysis is required to determine the effect of specific factors. The model may not provide cluster information for some patients. Although all patients in the test set were successfully classified into the 5 clusters, a patient from a different cohort can have different characteristics and may not belong to any cluster. UMAP embeds the data based on the topological relationship between input data points, which means that the causal relationship between the features is not considered. Therefore, domain knowledge is required to interpret the causal relationship between the features. Application of the clustering algorithm to other patient groups cannot stratify prognosis clearly if the risk factors in each cluster shift favorably or unfavorably.
However, this also has clinical implications in that it is possible to determine the prognosis in the presence of conflicting risk factors acting in combination in actual patients. Therefore, the clustering algorithm is of high clinical value.

The candidate risk factors for OA progression in patients with FAI were selected; then, an unsupervised machine-learning algorithm was applied to stratify the risk of OA progression. The clusters were characterized by BMI, type of impingement, and sex, which were identified as independent risk factors for OA progression by conventional statistics.

Data are reported as n (%) or mean ± SD. BMI, body mass index; ER, external rotation; IR, internal rotation; MRI, magnetic resonance imaging; OA, osteoarthritis.

Figure A1. Kaplan-Meier curve of train and test sets. Shading represents 95% CI.

1. Prevalence of associated deformities and hip pain in patients with cam-type femoroacetabular impingement
2. Predictors of progression of osteoarthritis in femoroacetabular impingement: a radiological study with a minimum of ten years follow-up
3. Hip morphology influences the pattern of damage to the acetabular cartilage: femoroacetabular impingement as a cause of early osteoarthritis of the hip
4. Femoroacetabular impingement
5. A systematic approach to the plain radiographic evaluation of the young adult hip
6. Femoroacetabular impingement: a cause for osteoarthritis of the hip
7. The Warwick Agreement on femoroacetabular impingement syndrome (FAI syndrome): an international consensus statement
8. Prevalence of cam-type femoroacetabular impingement morphology in asymptomatic volunteers
9. Incidence of femoroacetabular impingement and surgical management trends over time
10. The relationship between body mass index and hip osteoarthritis: a systematic review and meta-analysis
11. Transfusion after total knee arthroplasty can be predicted using the machine learning algorithm
12. Deep learning-based survival prediction of oral cancer patients
13. A web-based machine-learning algorithm predicting postoperative acute kidney injury after total knee arthroplasty
14. Uniform Manifold Approximation and Projection for Dimension Reduction
15. History of the Rochester Epidemiology Project
16. Risk factors for long-term hip osteoarthritis in patients with femoroacetabular impingement without surgical intervention
17. Femoroacetabular impingement: defining the condition and its role in the pathophysiology of osteoarthritis
18. Maximum lifetime body mass index is the appropriate predictor of knee and hip osteoarthritis