key: cord-0981898-vph2fuo3
authors: Pei, Yun; Yang, Wenzhuo; Wei, Shangqing; Cai, Rui; Li, Jialin; Guo, Shuxu; Li, Qiang; Wang, Jincheng; Li, Xueyan
title: Automated measurement of hip–knee–ankle angle on the unilateral lower limb X-rays using deep learning
date: 2020-11-30
journal: Phys Eng Sci Med
DOI: 10.1007/s13246-020-00951-7
sha: 131b5fb97e15d0cbae8bc45b8f98b65f6564e64e
doc_id: 981898
cord_uid: vph2fuo3

Significant inherent extra-articular varus angulation is associated with abnormal postoperative hip–knee–ankle (HKA) angle. At present, HKA is measured manually by orthopedic surgeons, which increases their workload. To determine HKA automatically, a deep learning-based method for measuring HKA on unilateral lower limb X-rays was developed and validated. This study retrospectively selected 398 double lower limb X-rays acquired between 2018 and 2020 at Jilin University Second Hospital. The images (n = 398) were cropped into unilateral lower limb images (n = 796). Deep neural networks were used to segment the head of hip, the knee, and the ankle in each image. Then, the mean square error of the distance between each internal point of each organ and the organ's boundary was calculated. The point with the minimum mean square error was set as the central point of the organ. HKA was determined from the coordinates of the three organs' central points according to the law of cosines. In a quantitative analysis, HKA was measured manually by three orthopedic surgeons with high consistency (176.90° ± 12.18°, 176.95° ± 12.23°, 176.87° ± 12.25°), as evidenced by a Kendall's W of 0.999 (p < 0.001). The average of their measurements (176.90° ± 12.22°) served as the ground truth. The HKA measured automatically by the proposed method (176.41° ± 12.08°) was close to the ground truth, with no significant difference. In addition, the intraclass correlation coefficient (ICC) between them is 0.999 (p < 0.001).
The average difference between prediction and ground truth is 0.49°. The proposed method shows high feasibility and reliability for clinical practice.

Yun Pei and Wenzhuo Yang contributed equally to this work.

Genu varum and genu valgum are deformities in which, with the lower limbs naturally straightened or in standing, either the ankles or the knees touch each other but the knees and ankles cannot be brought together at the same time. Early symptoms are difficult to detect, but severe deformities can lead to osteoarthritis, patellar malacia, and other diseases because the weight-bearing line of the lower extremities changes [1]. Early detection of an abnormal HKA is of great significance for improving the prognosis of genu varum and valgum. Varus malalignment has been reported in 53-76% of individuals with knee osteoarthritis [2]. HKA measured from a full-length lower limb radiograph is one of the gold standards for diagnosing knee malalignment. For the diagnosis of genu varum and valgum, the most common method is to measure the hip-knee-ankle angle (HKA) on X-ray images. In genu valgum, the patient's medial malleoli cannot be brought together and the lower limbs are X-shaped; in genu varum, the patient's knees cannot be brought together and the lower limbs are O-shaped. HKA is a measure of lower limb alignment, defined as the angle between the mechanical axes of the femur and the tibia, and is measured from a full-length lower limb radiograph [3]. In addition, HKA is commonly used to evaluate the anatomical structure of the lower extremities, diagnose pathology, plan operations, and evaluate the success of surgery [4]. Currently, HKA is drawn and measured manually by professional surgeons on X-ray images. However, hospitals produce so many full-length lower limb X-ray images every day that it is difficult for orthopedic surgeons to keep up. In addition, doctors in some underdeveloped areas are undertrained in diagnosis.
Therefore, there is an urgent need for a convenient and effective method to measure HKA. Traditional HKA measurement methods [5] [6] [7] rely on doctors to calculate the angle. No automatic measurement system has emerged yet. Artificial intelligence can automatically perform segmentation, classification, and registration on medical images. Computer-aided diagnosis using deep learning is increasingly applied in medical image analysis [8]. Kang Zhang et al. [9] developed an AI system for accurate diagnosis of COVID-19. Deep learning is also increasingly applied to medical image segmentation. Yun Pei et al. [10] proposed a novel network with a dual attention module to segment colorectal tumors. Google Health developed an AI system for breast cancer screening [11]. Feng Shi et al. [12] reviewed artificial intelligence techniques in imaging data acquisition, segmentation, and diagnosis for COVID-19. In this study, a novel technique is proposed to measure HKA automatically. Unlike previous studies on HKA measurement, we use segmentation neural networks to assist the angle measurement. The method is effective in angle prediction and greatly reduces the time orthopedic surgeons spend measuring angles.

This study was approved by the Ethics Review Board of Second Hospital of Jilin University. We selected 398 patients (112 males and 286 females, aged 5 to 85 years) who visited Second Hospital of Jilin University between October 2018 and August 2020. These patients underwent X-ray examinations of both lower limbs with equipment from Philips Medical System. The Window Center is 2047 and the Window Width is 4095. First, the DICOM files, which contain the original X-ray images and header information, were converted to JPG images while preserving the image proportions. Then, the double lower limb images were cut into unilateral lower limb images.
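The text gives the Window Center and Window Width but does not describe how the DICOM pixel values were mapped to 8-bit JPG gray levels. As an illustration only, the following minimal sketch applies a simplified linear window to stored pixel values. The function names `apply_window` and `window_image` are hypothetical, the simplified formula omits the half-unit offsets of the full DICOM windowing definition, and nothing here is taken from the authors' conversion code.

```python
def apply_window(value, center=2047.0, width=4095.0):
    """Map one stored pixel value to an 8-bit gray level with a linear window.

    Assumption: a simplified linear window; values below the window map to 0
    and values above it map to 255.
    """
    low = center - width / 2.0
    high = center + width / 2.0
    if value <= low:
        return 0
    if value >= high:
        return 255
    return int((value - low) / (high - low) * 255)


def window_image(pixels, center=2047.0, width=4095.0):
    """Apply the window to a 2D list of stored values, row by row."""
    return [[apply_window(v, center, width) for v in row] for row in pixels]


if __name__ == "__main__":
    # Small made-up grid of stored values, not real image data.
    demo = [[0, 1024, 2047], [3000, 4095, 5000]]
    for row in window_image(demo):
        print(row)
```

With a center of 2047 and a width of 4095, the window spans essentially the full 12-bit range, so under this assumption the conversion is close to a plain rescaling of the stored values.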
After that, the whole dataset included 796 images. The exclusion criteria were as follows: (1) hip replacement; (2) severe developmental dysplasia of the hip; (3) knee replacement; (4) artificial limb; and (5) poor image quality. The criteria were applied per organ. An image with a knee replacement but no hip replacement could be kept in the dataset for segmenting the head of hip while being excluded from the dataset for segmenting the knee. An image with an artificial limb could be excluded from the datasets for segmenting the knee and the ankle while being kept in the dataset for segmenting the head of hip. We randomly selected 676 images to develop and validate the three deep neural networks; 80% of these images were used to train the models and the other 20% were used for validation. The remaining 120 images made up the testing dataset, which was used to test the accuracy of the segmentation results and the performance of the HKA calculation. The numbers of images are detailed in Fig. 1.

To measure HKA, clinicians first determine three points on the X-ray, located at the head of the femur, the knee, and the ankle. The proposed method uses deep neural networks to segment the three organs separately and locates the central point of each organ using a novel method. HKA can then be determined automatically from the coordinates of the central points.

The deep neural network is fine-tuned based on U-Net [13]. It consists of an encoder module, which extracts abstract semantic information, and a decoder module, which restores the feature map from the encoder module to the original size of the input image. The encoding part is made up of convolution layers, rectified linear unit (ReLU) layers, batch normalization layers, and max pooling layers. The convolution operation extracts local features with a 3 × 3 kernel, while the max pooling operation downsamples the feature maps with a 2 × 2 kernel.
The decoding part consists of deconvolution layers, which perform the inverse operation to enlarge the feature map, convolution layers, batch normalization layers, and ReLU layers. Between the encoder and decoder modules, the network uses skip connections to fuse feature maps of the same size from the encoding and decoding parts. Unlike in traditional neural networks, the input images of this model do not all have the same size. We keep the original sizes of the X-rays instead of reshaping them to a common size so that the HKA is not changed. In a skip connection, the fusing operation requires the fused feature maps to have the same size. After the convolution and max pooling operations in the encoding part, the length and width of the feature map are not known in advance. Therefore, before each skip connection, the network uses bilinear upsampling to give the feature map from the decoding module the same size as the feature map with the same number of channels from the encoding module. The structure of the deep neural network is shown in Fig. 2. The parameters of the deep neural network are shown in Table 1.

The neural networks segment the three organs effectively, but the shapes of the segmentation results are irregular. To calculate HKA, the central points of the organs must be determined, so an algorithm was defined for this purpose. The points of the edge contour are defined as $C(x_j^{BDY}, y_j^{BDY})$, $j \in [1, n]$; the internal points of the segmentation area are defined as $I(x_i, y_i)$, $i \in [1, m]$; and the distance from internal point $i$ to boundary point $j$ is defined as $d_{i,j}$, as shown in Fig. 3. Here $n$ is the total number of boundary points and $m$ is the total number of internal points of the segmentation result. First, calculate the distances from $I(x_1, y_1)$ to each $C(x_j^{BDY}, y_j^{BDY})$ as $d_{1,1}, d_{1,2}, \cdots, d_{1,n}$. Then, calculate the mean squared error (MSE) of $d_{1,1}, d_{1,2}, \cdots, d_{1,n}$.
Repeat the above operation to obtain the MSE for every internal point. Finally, compare all the MSEs and select the internal point with the smallest MSE as the central point. After the central points of the three organs are obtained, the law of cosines is used to calculate HKA. The testing data is divided into left lower limbs and right lower limbs. The horizontal reference line, whose vertex is the central point of the knee, points left when the image is a left lower limb X-ray and right when the image is a right lower limb X-ray. We mark the central point of the head of hip as

In the segmentation task, both the outline of the segmented area and the overlap between prediction and ground truth are crucial indicators of segmentation accuracy. To evaluate the performance of the segmentation network, three indexes are used in this paper. (In this study, pixels inside the segmented organs are defined as positive pixels; all others are defined as negative pixels.) The Dice coefficient reflects the overlapping area between prediction and ground truth; P and G denote the number of positive pixels in the prediction and in the ground truth, respectively. Recall is the proportion of correctly predicted positive pixels among all positive pixels in the ground truth. Precision is the proportion of correctly predicted positive pixels among all predicted positive pixels.

IBM SPSS Statistics 24.0 software was used to analyze the correlation. 95% confidence intervals (CIs) were calculated for continuous estimated parameters. Statistical significance was set at p < 0.01. Kendall's W and univariate analysis were performed to examine the measurement consistency among the three orthopedic surgeons. Student's t-test and ICC were used to evaluate the similarity between predicted and ground truth values.

The experimental platform was equipped with one NVIDIA GeForce RTX 2080 graphics processor with 16 GB of memory. The processor was an Intel Core i7-9700K CPU.
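The central-point search and the law-of-cosines step described in the methods can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' implementation: the function names `center_point` and `hka_angle` are hypothetical; the "MSE" of the distances is read literally as the mean of the squared distances from an internal point to all boundary points (another possible reading is the variance of those distances about their mean); and HKA is taken as the angle at the knee between the hip and ankle central points. The horizontal reference line and the left/right handling mentioned in the text are not reproduced.

```python
import math


def center_point(internal, boundary):
    """Return the internal point with the smallest mean squared distance
    to the boundary points (assumed reading of the MSE criterion)."""
    def mean_sq_dist(p):
        return sum((p[0] - bx) ** 2 + (p[1] - by) ** 2
                   for bx, by in boundary) / len(boundary)
    return min(internal, key=mean_sq_dist)


def hka_angle(hip, knee, ankle):
    """Angle in degrees at the knee between the hip and the ankle,
    computed with the law of cosines."""
    a = math.dist(knee, hip)     # knee-to-hip side
    b = math.dist(knee, ankle)   # knee-to-ankle side
    c = math.dist(hip, ankle)    # side opposite the knee
    if a == 0 or b == 0:
        raise ValueError("knee must differ from hip and ankle")
    cos_theta = (a * a + b * b - c * c) / (2 * a * b)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Toy square "organ": boundary is the outer ring of a 5 x 5 grid.
    boundary = [(x, y) for x in range(5) for y in range(5)
                if x in (0, 4) or y in (0, 4)]
    internal = [(x, y) for x in range(1, 4) for y in range(1, 4)]
    print(center_point(internal, boundary))
    # Made-up central points in pixel coordinates.
    print(hka_angle((100, 50), (110, 400), (105, 760)))
```

Because the search is restricted to internal points of the segmentation, the selected point always lies inside the segmented area, which is the property the method relies on. The cost of this brute-force version grows with the product of the numbers of internal and boundary points.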
The networks were trained and tested on a Windows 10 system. The algorithms were developed with PyTorch 0.4.1 (https://pytorch.org/) as the framework and Python 3.6 as the programming language. The three segmentation networks were trained with the same parameters, using Adam as the optimizer. The learning rate was set to 0.001 and the batch size was 1. The same deep learning model structure was trained three times, once for each organ, to segment the three organs separately.

We used fivefold cross-validation to evaluate the deep learning model for segmenting the organs. First, the dataset was randomly divided into five groups without repeated samples. One of the five groups was selected as the validation dataset, and the remaining four groups were used as the training dataset to train the model. This was repeated five times so that each group served once as the validation dataset. The results on the validation datasets were averaged to evaluate the performance of the segmentation model. The Dice coefficients of the fivefold cross-validation on the validation dataset are shown in Table 2. The average Dice coefficient is 0.8244 for the head of hip, 0.9251 for the knee, and 0.8988 for the ankle bone. We chose the third-fold model parameters for the head of hip, the first-fold model parameters for the knee, and the first-fold model parameters for the ankle bone as the final model parameters. The segmentation results on the testing data are shown in Table 3. Compared with the ground truth, the Dice, recall, and precision of the deep neural network were 83.18%, 81.20%, and 86.74% for the head of hip; 93.01%, 90.75%, and 95.69% for the knee; and 89.83%, 90.30%, and 89.79% for the ankle, respectively. The models for segmenting the head of hip, the knee, and the ankle were trained for 150 epochs in each fold.
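The three segmentation indexes reported above (Dice, recall, precision) can be computed from a predicted mask and a ground truth mask as in the following minimal sketch. It uses the standard definitions in terms of true positives, false positives, and false negatives; the function name `segmentation_scores` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not details from the paper.

```python
def segmentation_scores(pred, truth):
    """Return (dice, recall, precision) for two binary masks given as
    2D lists of 0/1 values with the same shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # positive in both prediction and truth
            elif p and not t:
                fp += 1          # predicted positive, actually negative
            elif t and not p:
                fn += 1          # missed positive pixel
    n_pred = tp + fp             # positive pixels in the prediction (P)
    n_truth = tp + fn            # positive pixels in the ground truth (G)
    dice = 2 * tp / (n_pred + n_truth) if (n_pred + n_truth) else 0.0
    recall = tp / n_truth if n_truth else 0.0
    precision = tp / n_pred if n_pred else 0.0
    return dice, recall, precision


if __name__ == "__main__":
    # Tiny made-up masks, not data from the study.
    pred = [[1, 1, 0],
            [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    print(segmentation_scores(pred, truth))
```

A prediction that misses boundary pixels lowers recall, while a prediction that spills over the boundary lowers precision; Dice penalizes both, which fits the observation that errors concentrate at the contour.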
The sky blue area in Fig. 5a shows the ground truth for segmentation and the sky blue area in Fig. 5b shows the segmentation result. The deep neural networks accurately segment the organs used to determine the central point coordinates. In the testing dataset, the segmented head of hip, knee, and ankle largely coincide with the correct positions of the organs.

To validate the method, we compared the predictions with HKA measured manually on each image of the testing dataset by three orthopedists (with 13, 10, and 7 years of experience) using Biomet Orthosize Templating (Warsaw, Indiana, America, https://www.orthosize.com/). The three sets of measurements were analyzed statistically to evaluate their consistency, using Kendall's W to quantify the agreement. The Kendall's W coefficient is 0.999 and the p value is less than 0.001, which indicates that the three orthopedists' angle measurements are highly consistent. We chose the average of the angles measured by the three orthopedists as the ground truth. The distributions of the manual measurements and the predictions are compared in Fig. 6. The maximum, the minimum, the upper quartile, and the lower quartile of the manual measurements and of the predictions fall in the same range. The angle was computed on 120 X-ray images. The statistical analysis is shown in Table 4. The mean ± standard deviation is 176.90° ± 12.22° for the ground truth and 176.41° ± 12.08° for the prediction. The ICC between ground truth and prediction indicates high consistency: its value with 95% CI is 0.999 (0.996, 0.999). The p value for the ICC is less than 0.001, and there is no significant difference between the two groups. The average difference between prediction and ground truth is 0.49°.
The proportion of calculated angles that deviate from the ground truth by less than 1.5° is 89.17%; it falls to 69.17% for a deviation of less than 1.0° and to 39.17% for a deviation of less than 0.5°. The average of the measurements from the three surgeons is taken as the ground truth. A Bland-Altman plot with three reference lines shows the difference between prediction and ground truth. In the Bland-Altman plot (Fig. 7), the solid line denotes the mean of all differences between predicted angle and ground truth, which is −0.4905, and the dashed lines lie 1.96 standard deviations, a distance of 1.4792, from the mean.

Measuring HKA with deep learning has yet to be developed. A traditional deep learning network requires input images of the same size. However, the length and width of different X-rays are not the same, and the aspect ratio of the images cannot be changed if the angle is to be measured. In addition, deep learning is generally used for segmentation, detection, or classification tasks rather than for measuring angles; a deep learning method must be paired with a complex post-processing step to do so. Therefore, applying deep learning to HKA measurement is a challenge. To determine HKA automatically, we set out to develop and validate an end-to-end artificial intelligence system. The new method for measuring HKA does not rely on a physician; it uses deep neural networks and a novel algorithm that searches for the central points of the organs to calculate the angle automatically. The predictions and the orthopedists' measurements are highly consistent: the ICC between the two groups reached 0.999 (p < 0.001), and the new method saves doctors' time. The Bland-Altman plot shows narrow limits of agreement between ground truth and prediction. The measurement method proposed in this study is similar to the way doctors measure the angle; it uses a computer algorithm to imitate the doctor's workflow.
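The agreement statistics discussed above (the mean difference, the 1.96 standard deviation limits of the Bland-Altman plot, and the proportion of angles within a given tolerance) can be computed as in the following minimal sketch. The function names `bland_altman` and `fraction_within` are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def bland_altman(pred, truth):
    """Return (mean difference, lower limit, upper limit), where the limits
    lie 1.96 sample standard deviations from the mean difference."""
    diffs = [p - t for p, t in zip(pred, truth)]
    mean_diff = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)  # assumed: sample SD
    return mean_diff, mean_diff - spread, mean_diff + spread


def fraction_within(pred, truth, tolerance):
    """Fraction of predictions whose absolute deviation from the ground
    truth is strictly less than the tolerance (in degrees)."""
    hits = sum(1 for p, t in zip(pred, truth) if abs(p - t) < tolerance)
    return hits / len(pred)


if __name__ == "__main__":
    # Made-up angles in degrees, not measurements from the study.
    predicted = [176.2, 181.0, 169.4, 178.8]
    measured = [176.9, 181.3, 170.6, 178.6]
    print(bland_altman(predicted, measured))
    for tol in (0.5, 1.0, 1.5):
        print(tol, fraction_within(predicted, measured, tol))
```

A negative mean difference, as reported for Fig. 7, means the predicted angles are on average slightly smaller than the manually measured ones.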
Because the outlines of the femoral head, the knee, and the ankle are recognizable, deep learning algorithms are well suited to segmenting them. Among the three segmentation tasks, the knee is segmented best. This is because the edge contour of the knee is not surrounded by other organs or tissue, so the deep neural network can extract its features correctly. The head of hip is surrounded by the pelvis and the ankle is close to the bottom of the tibia, which makes their outlines hard for the deep neural networks to find, so some pixels are falsely predicted. The mispredicted pixels are concentrated at the edge contour of the segmentation. When the system determines the central point of an organ, the coordinates of the central point are predicted accurately as long as the central area of the organ is segmented. We did not use the center of mass as the central point of the organ. Instead, we proposed a novel algorithm to search for it, because we found that some segmentation results are not continuous regions, as in Fig. 8. The method we proposed ensures that the central points are located inside the organs and are as close as possible to the points manually marked by doctors. Therefore, our method effectively reduces the influence of segmentation noise on the central point coordinates. For discontinuous regions, the center of mass cannot be relied on. In addition, the images in the testing dataset were randomly selected from all the data, so the testing data included images with bad contrast and endoprostheses, as in Fig. 9. This indicates that our angle measurement system is robust. In clinical diagnosis, orthopedic surgeons need to determine the central points of the three organs manually, which takes a lot of their time. Our method achieves automatic measurement. However, this study has some limitations. Deep learning algorithms rely on large volumes of data.
Currently, data from a single centre was used to develop a highly accurate system; to improve the robustness of the system, data from different medical centres needs to be collected in the future. In related research on deep learning for HKA measurement, Thong Phi Nguyen et al. [14] chose a detection algorithm to determine the position of each organ. The detection algorithm surrounds the organs with boxes. In contrast, a segmentation algorithm can accurately determine the contour of the organ, so the central points of the organs can be closer to the points marked by doctors. In addition, our test set was larger and contained images with bad contrast and endoprostheses.

Severe malalignment or rotational deformities of the lower extremity and patient positioning during imaging can influence the accuracy of two-dimensional (2D) HKA measurement [15]. To solve this problem, three-dimensional (3D) lower limb reconstruction is used to determine the position of the organs. This technology requires two X-ray acquisitions of the patient (the patient is first positioned standing in the cabin with the feet parallel in a free-standing position; the second acquisition is performed with one leg slightly shifted relative to the other) [16, 17]. The patient is therefore irradiated twice, which increases the radiation exposure. On the other hand, open source datasets such as the Osteoarthritis Initiative (OAI) and clinical practice use only a single exposure, so it is impossible to measure HKA with 3D reconstruction in these settings. In addition, researchers have observed a correlation between HKA and the femur-tibia angle (FTA) on the knee radiograph. Calculating FTA only requires a knee X-ray of the patient. It is cost-effective and involves minimal radiation exposure [18].

Fig. 8 The discontinuous regions of segmentation results

Fig. 9 The segmentation results of bad contrast and endoprostheses
In future research, an automatic FTA measurement algorithm will be studied.

We proposed a novel automatic HKA measurement method using deep learning algorithms. The method employs deep neural networks to segment the head of hip, the knee, and the ankle, and then searches for the central point with the minimum MSE of the distance between itself and the boundary of the organ. HKA is calculated from the coordinates of the three central points by the law of cosines. With the new method, only a small difference was observed between prediction and ground truth, and the ICC reached 0.999. The accuracy of the angle values predicted by the system is similar to that of orthopedic surgeons, while the system saves orthopedic surgeons' time.

[1] Genu varum and genu valgum in children: differential diagnosis and guidelines for evaluation
[2] Is there an alternative to the full-leg radiograph for determining knee joint alignment in osteoarthritis?
[3] Does measurement of the anatomic axis consistently predict hip-knee-ankle angle (HKA) for knee alignment studies in osteoarthritis? Analysis of long limb radiographs from the multicenter osteoarthritis (MOST) study
[4] Hip-Knee-Ankle (HKA) angle modification during gait in healthy subjects
[5] Due to great variability fixed HKS angle for alignment of the distal cut leads to a significant error in coronal TKA orientation
[6] What is the best hip center location method to compute HKA angle in computer-assisted orthopedic surgery? In silico and in vitro comparison of four methods
[7] Postoperative radiologic outcome comparison between conventional and computer-assisted navigation total knee arthroplasty in extra-articular tibia vara
[8] A survey on deep learning in medical image analysis
[9] Clinically applicable AI system for accurate diagnosis, quantitative measurements, and prognosis of COVID-19 pneumonia using computed tomography
[10] Colorectal tumor segmentation of CT scans based on a convolutional neural network with an attention mechanism
[11] International evaluation of an AI system for breast cancer screening
[12] Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for COVID-19
[13] U-net: convolutional networks for biomedical image segmentation
[14] Intelligent analysis of coronal alignment in lower limbs based on radiographic image with convolutional neural network
[15] Effect of limb rotation on radiographic alignment in total knee arthroplasties
[16] Are advanced three-dimensional imaging studies always needed to measure the coronal knee alignment of the lower extremity?
[17] Fast 3D reconstruction of the lower limb using a parametric model and statistical inferences and clinical measurements calculation from biplanar X-rays
[18] A comparison of five approaches to measurement of anatomic knee alignment from radiographs

Conflict of interest There is no conflict of interest in this work.

Research involving human participants and/or animals All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee (The Ethics Approval Certificate of Gazi University Ethics Commission dated 08/05/2018 and numbered 2018-217) and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.