key: cord-0028083-31qypodj
authors: Liu, Haixia; Cui, Guozhong; Luo, Yi; Guo, Yajie; Zhao, Lianli; Wang, Yueheng; Subasi, Abdulhamit; Dogan, Sengul; Tuncer, Turker
title: Artificial Intelligence-Based Breast Cancer Diagnosis Using Ultrasound Images and Grid-Based Deep Feature Generator
date: 2022-03-01
journal: Int J Gen Med
DOI: 10.2147/ijgm.s347491
sha: 0e9e245bec3bb34377a5b6c9791dc4eedd5d5396
doc_id: 28083
cord_uid: 31qypodj

PURPOSE: Breast cancer is a prominent cancer type with high mortality. Early detection of breast cancer could serve to improve clinical outcomes. Ultrasonography is a digital imaging technique used to differentiate benign and malignant tumors. Several artificial intelligence techniques have been suggested in the literature for breast cancer detection using breast ultrasonography (BUS). Nowadays, deep learning methods in particular have been applied to biomedical images to achieve high classification performance.

PATIENTS AND METHODS: This work presents a new deep feature generation technique for breast cancer detection using BUS images. Sixteen widely known pre-trained CNN models have been used in this framework as feature generators. In the feature generation phase, the input image is divided into rows and columns, and the deep feature generators (pre-trained models) are applied to each row and column. Therefore, this method is called a grid-based deep feature generator. The proposed grid-based deep feature generator calculates the error value of each deep feature generator and then selects the best three feature vectors to form the final feature vector. In the feature selection phase, iterative neighborhood component analysis (INCA) chooses 980 features as the optimal number of features. Finally, these features are classified using a deep neural network (DNN).

RESULTS: The developed grid-based deep feature generation-based image classification model reached 97.18% classification accuracy on the ultrasonic images for three classes, namely malignant, benign, and normal.

CONCLUSION: The findings clearly indicate that the proposed grid-based deep feature generator and INCA-based feature selection model successfully classified breast ultrasonic images.

Breast cancer is one of the leading causes of mortality in women worldwide,1,2 and 2.26 million women were diagnosed with breast cancer in 2020.3 According to the World Health Organization (WHO), these figures make breast cancer the most common type of cancer among women in the world. Breast cancer also has the fifth highest cancer death rate worldwide (685,000 deaths). In the last 5 years, 7.8 million women diagnosed with breast cancer have survived.3,4 Breast cancer can occur in women of any age; however, it is more common at older ages. Early diagnosis is very important in breast cancer, as with all cancer types, and early diagnosis of breast cancer contributes to a reduction in the frequency of early deaths.5-7 Ultrasound imaging is a useful diagnostic technique for detecting and classifying breast abnormalities.8 Artificial intelligence (AI) is gaining popularity due to its superior performance in image-recognition tasks, and it is increasingly being used in breast ultrasonography (BUS). AI can provide a quantitative assessment by automatically identifying imaging data and making more accurate and reproducible imaging diagnoses.8-10 As a result, the use of AI in breast cancer detection and diagnosis is crucial.11
It may save radiologists time and compensate for certain beginners' lack of experience and expertise. We provide a machine-learning technique and propose a strategy for benign versus malignant breast tumor categorization in BUS images that does not require a priori tumor region-selection processing, reducing clinical diagnosis effort while retaining good classification performance. This work aims to develop an AI-based diagnosis model for breast cancer detection using two-dimensional grayscale ultrasound images. AI techniques can perform well in identifying benign and malignant breast tumors and have the potential to enhance diagnostic accuracy and minimize needless biopsies of breast lesions in practice.

Several studies on breast cancer detection are presented in the literature. Qi et al12 proposed a breast cancer detection method using breast ultrasonography images. This method was based on deep neural networks. In their study, 8145 breast ultrasonography images were used, and the accuracy rate for the two classes (malignant and non-malignant) was 90.13%. Eroglu et al13 presented a classification method based on CNNs. The main purpose of their study was to detect breast cancer from BUS images belonging to three classes (benign, malignant, and normal). In their study, 780 BUS images were used, and 95.60% accuracy was achieved with a support vector machine classifier. Drukker et al14 investigated computer-aided diagnosis of breast ultrasound images.

In this work, a new deep learning framework based on CNNs is presented for this computer vision problem. Local features are very meaningful for yielding high performance; therefore, patch-based deep models such as vision transformers and multilayer perceptron mixers have attained high classification performance. However, patch/exemplar-based models generate huge-sized feature vectors. A new grid-based deep feature generator is proposed using 16 pre-trained CNN models to utilize the effectiveness of patch/exemplar-based models with fewer features. In this way, the optimal deep feature generator is selected and high accuracy is attained using fewer features (a less complex feature generation procedure). The presented grid-based deep feature generator model is used to create a cognitive machine learning model. Each phase of this model is designed cognitively, and it can select the optimal pre-trained CNN models, using an iterative feature selector and a deep classifier to attain maximum performance. Finally, a BUS dataset is selected to test the performance of our proposed framework, and we have achieved 97.18% accuracy on this dataset.

We utilized the publicly available BUS image dataset collected from Baheya Hospital; the dataset is freely available on the web site.22 The ethics committee of Baheya Hospital approved the study protocol. The data collected at baseline include breast ultrasound images from 600 female patients aged between 25 and 75 years. These data were collected in 2018. The LOGIQ E9 ultrasound system and the LOGIQ E9 Agile ultrasound system were utilized in the scanning procedure. These devices are often utilized in high-end imaging for radiology, cardiology, and vascular applications. They generate images at a resolution of 1280×1024. The transducers on the ML6-15-D Matrix linear probe are 1-5 MHz. The dataset consists of 780 images with an average image size of 500×500 pixels. This BUS image dataset has been used to demonstrate the comparative success of the proposed grid-based deep feature generator framework.
This dataset is heterogeneous: it contains 133 normal, 437 benign, and 210 malignant images. All images are grayscale. The data were originally in DICOM format and were converted into PNG format using a DICOM converter program. The BUS dataset contains three categories: normal, benign, and malignant. The total number of images acquired at the start was 1100. Duplicated images were eliminated. Furthermore, Baheya Hospital radiologists evaluated and corrected inaccurate annotations. To remove unnecessary and insignificant borders, all images were cropped to different sizes. The image annotation is placed in the image caption. After preprocessing, the number of BUS images in the dataset was reduced to 780. The original images include irrelevant information that is not used for mass categorization and may affect the outcome of the training process. To make the ultrasound dataset useful, ground-truth boundaries were defined, and a freehand segmentation was generated for each image.22

A new grid-based deep feature generator is proposed in this work, and a new computer vision framework is constructed by applying the proposed generator. The main purpose of our feature generator is to attain high classification ability like exemplar-based deep models, but exemplar/patch-based deep models carry a heavy time burden. The ultrasonic image is divided into rows and columns to decrease the complexity of exemplar feature generation without decreasing performance. For instance, by using a 5×5 sized grid in an exemplar model, 25 exemplars are obtained, and the feature generator must extract features from all 25 exemplars. However, with our proposed model, only 10 grids (5 rows and 5 columns) are obtained for a 5×5 sized mask. Another problem of deep learning-based models is choosing the most appropriate network for the problem at hand; therefore, many researchers have used trial and error to find the best model for their problems. In this study, we proposed a framework using 16 CNNs. The presented framework generates an error vector that is used to choose the best model(s). In this work, the proposed grid-based feature generator is imported into this framework to achieve high accuracy for this problem. The proposed framework uses INCA23 to choose the most appropriate feature vector, and a deep neural network is deployed to obtain the results. The graphical outline of the proposed method is shown in Figure 1.

In the proposed framework, ultrasonic images are divided into grids, and eight grids (g1, g2, …, g8) are obtained. In the deep feature generation phase, the fully connected layer of each pre-trained model is used for feature extraction, and 9000 features are extracted from each ultrasonic image after the feature merging step. The top 1000 of the generated 9000 features are chosen using the NCA24 feature selector. The misclassification rate of each pre-trained model is calculated using an SVM25,26 classifier with 10-fold cross-validation; the SVM classifier is utilized as a loss function in this framework. By using the calculated loss values, the best pre-trained models are selected for the computer vision problem. The final hybrid deep model for the ultrasonic image classification problem is shown in Figure 2. Figure 2 denotes the proposed grid-based deep transfer learning framework for finding an appropriate model for the medical image analysis problem.
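The grid division can be illustrated with the short MATLAB sketch below. It is a minimal sketch under an assumption: the eight grids g1–g8 are taken to be four horizontal and four vertical strips of equal size, consistent with the 2m strips obtained from an m×m mask described above. The helper name makeGrids is hypothetical and is not part of the authors' code.

```matlab
% Minimal sketch of the grid (strip) division: assuming the eight grids
% g1..g8 are four horizontal and four vertical strips of the input image.
function grids = makeGrids(im)
    [h, w, ~] = size(im);   % h: image height, w: image width
    rh = floor(h / 4);      % height of each horizontal strip
    cw = floor(w / 4);      % width of each vertical strip
    grids = cell(1, 8);
    for k = 1:4
        grids{k}     = im((k-1)*rh+1 : k*rh, :, :);  % kth horizontal strip
        grids{k + 4} = im(:, (k-1)*cw+1 : k*cw, :);  % kth vertical strip
    end
end
```

Each of the eight strips, together with the original image, is then passed to a pre-trained network, yielding 9 × 1000 = 9000 features per image as described above.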
Furthermore, the proposed framework has three main phases (feature extraction, feature selection, and classification), which are denoted in Figure 1, and the pseudocode of this model is given in Algorithm 1. In the testing phase, features are generated from images using the chosen optimal pre-trained networks, namely ResNet101, MobileNetV2, and EfficientNetb0. By deploying these three networks, three feature vectors with a length of 9000 are created. The indexes of the most informative features are stored, and by deploying these stored indexes, the most valuable features are chosen directly; there is therefore no need to run NCA and INCA again in the testing phase. These features are classified by deploying a DNN classifier. More details about these phases are given in this section.

The most complex phase of the proposed grid-based deep transfer learning framework is feature extraction, since feature generation directly affects the classification ability of the learning model. The proposed grid-based deep feature generator model selects features twice, using NCA and loss values, to extract the most appropriate features for the classification problem. The steps of the presented grid-based deep feature generation model are given below (a code sketch of these steps follows Step 6 below).

Step 1: Create eight grids from the image using Equations (1) and (2), where im is the ultrasonic image, g_k represents the kth grid, w defines the width of the image, and h represents the height of the image. Hence, vertical and horizontal grids are created, and eight grids are generated in total.

Step 2: Generate features from the grids and the original ultrasonic image using 16 pre-trained networks, which are ResNet18,27 ResNet50,27 ResNet101,27 DarkNet19,28 MobileNetV2,29 DarkNet53,28 Xception,30 EfficientNetb0,31 ShuffleNet,32 DenseNet201,33 InceptionV3,34 InceptionResNetV2,35 GoogleNet,36 AlexNet, VGG16,37 and VGG19.37 In this respect, the framework is extendable, and more pre-trained networks can be added to generate features. These deep feature generators are selected by applying the proposed grid-based deep framework. The pre-trained networks were trained on the ImageNet dataset, which contains millions of images in 1000 classes. Hence, each pre-trained network generates 1000 features, and the last fully connected layer of each network is utilized to generate them. In Equations (3) and (4), PN_h defines the hth pre-trained network, dim is the number of ultrasonic images, and x_h is the hth generated feature vector with a length of 9000, created from the original image and the grids of the image. By applying Equations (3) and (4), 9000 features are generated using each pre-trained model; these equations define the feature extraction and merging phases together.

Step 3: Reduce the dimension of the extracted feature vectors (x) by deploying the NCA selector, as in Equations (5) and (6), where fx_h are the selected features with a length of 1000. Equations (5) and (6) are used to choose the most informative 1000 features from the generated 9000 features. The most informative/meaningful features are chosen using id_h (indexes ranked according to the generated weights).

Step 4: Calculate the misclassification rate of each feature vector (x) by deploying an SVM classifier with 10-fold cross-validation. Herein, 16 misclassification rates are generated.

Step 5: Choose the best three pre-trained models using the generated 16 loss values.
Step 6: Merge the fx vectors to calculate the final feature vector. Herein, the best three feature vectors (fx(id_h)) are chosen using the loss values. In this work, the best pre-trained models selected for feature extraction are ResNet101, MobileNetV2, and EfficientNetb0. The created final vector (f) has 3000 features. In the feature selection phase, INCA is deployed to choose the best feature combination, and the details of feature selection are explained in Section B.

INCA is an iterative, improved version of NCA, and it was proposed by Tuncer et al in 2020.23 INCA uses a loss function to select the best features, and it has an iterative structure. It is a parametric feature selection method: users can define the initial value of the loop, the final value of the loop, and the loss function. Generally, a classifier is utilized as the loss function, and the loop range is defined to decrease the time complexity of INCA. The initial value of the loop is set to 100, the end value of the loop is set to 1000, and a third-degree (cubic) SVM with 10-fold cross-validation is utilized as the loss function. By using these parameters, the best features are selected from the generated 3000 features. The length of the selected best feature vector is found to be 980.

The last phase of the proposed grid-based deep learning model is classification.38 A deep neural network (DNN) is a form of artificial neural network (ANN) with two or more hidden layers. Because gradient computation of functions is required, the used DNN is a backpropagation-trained network that uses scaled conjugate gradient (SCG) for learning; the SCG algorithm starts from the steepest descent direction. During the DNN implementation, the initial weights are randomly assigned, and ĥ (the input of the hidden layers) is calculated using Eq. (8), where W is the assigned weights, x shows the inputs, and f is the activation function. The weights are then recalculated using the backpropagation approach. In this phase, SCG is used; it utilizes mutually conjugate (orthogonal) search directions to minimize the error. The mathematical notation of SCG is given in Eqs. (9)-(11):

x = Σ_{i=1}^{n} s_i d_i    (9)

where s_i is a multiplier, d_i is an orthogonal vector, and x is the input. The weights are recalculated using this optimization method. To evaluate the efficacy of the proposed feature extraction and selection framework, the selected 980 features are fed into the SCG-based three-hidden-layer DNN. There is currently no standard method for building an ideal deep learning model with an adequate number of layers and neurons in each layer. As a result, we developed the DNN experimentally through several trials. In each experiment, we carefully tuned the number of hidden layers, the number of nodes in each hidden layer, the number of learning steps, the learning rate, the momentum, and the activation function. We used the SCG optimization approach for the backpropagation method and adjusted the learning rate to 0.7, the momentum to 0.3, and the batch size to 100. To find the remaining DNN hyperparameters, we computed the classification accuracy using 10-fold cross-validation for each manual configuration.39 This procedure was repeated for various sizes of hidden-layer representations. Following this laborious manual procedure, the best classification result was obtained with a DNN composed of three hidden layers of 400, 180, and 40 nodes, respectively. In this study, the scaled conjugate gradient is used as the optimizer, and the tangent sigmoid is used as the activation function.
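The feature generation, selection, and classification pipeline described above can be summarized with the hedged MATLAB sketch below. It is a simplified sketch under several assumptions: only ResNet101 is shown as the deep feature generator (using its final fully connected layer 'fc1000'; the other 15 networks follow the same pattern with their own final layer names), imds and y denote a hypothetical imageDatastore of BUS images and the corresponding labels, makeGrids is the helper sketched earlier, F is the hypothetical merged 3000-feature matrix of the three best networks, and the INCA-style loop is written directly from the description in the text rather than from the authors' implementation.

```matlab
% --- Deep feature generation for one pre-trained network (Steps 1-3) ---
net = resnet101;                                  % Deep Learning Toolbox model
inSize = net.Layers(1).InputSize(1:2);
numImages = numel(imds.Files);                    % imds: imageDatastore of BUS images
X = zeros(numImages, 9000);                       % 9 parts x 1000 features each
for i = 1:numImages
    im = readimage(imds, i);
    parts = [{im}, makeGrids(im)];                % original image + 8 grids
    feats = zeros(1, 0);
    for p = 1:numel(parts)
        rp = imresize(parts{p}, inSize);
        if size(rp, 3) == 1, rp = repmat(rp, 1, 1, 3); end
        feats = [feats, activations(net, rp, 'fc1000', 'OutputAs', 'rows')];
    end
    X(i, :) = feats;
end

% NCA ranking: keep the 1000 most informative of the 9000 features
nca = fscnca(X, y);
[~, order] = sort(nca.FeatureWeights, 'descend');
Xsel = X(:, order(1:1000));

% --- Loss value of this network: cubic SVM with 10-fold CV (Step 4) ---
svmT = templateSVM('KernelFunction', 'polynomial', 'PolynomialOrder', 3);
cvm  = crossval(fitcecoc(Xsel, y, 'Learners', svmT), 'KFold', 10);
lossValue = kfoldLoss(cvm);                       % used to rank the 16 networks

% --- Steps 5-6 and INCA-style selection on the merged features ---
% F = [Xsel_resnet101, Xsel_mobilenetv2, Xsel_efficientnetb0];  % hypothetical merge
ncaF = fscnca(F, y);
[~, rankF] = sort(ncaF.FeatureWeights, 'descend');
bestLoss = inf;
for n = 100:1000                                  % INCA loop range from the text
    cvn = crossval(fitcecoc(F(:, rankF(1:n)), y, 'Learners', svmT), 'KFold', 10);
    Ln = kfoldLoss(cvn);
    if Ln < bestLoss
        bestLoss = Ln;
        bestIdx  = rankF(1:n);                    % 980 features reported in the paper
    end
end

% --- Classification: three-hidden-layer DNN (400, 180, 40) with SCG ---
targets = full(ind2vec(grp2idx(y)'));             % one-hot targets
dnn = patternnet([400 180 40], 'trainscg');       % scaled conjugate gradient
for L = 1:3, dnn.layers{L}.transferFcn = 'tansig'; end  % tangent sigmoid
dnn = train(dnn, F(:, bestIdx)', targets);
```

In practice, the loss computation is repeated for all 16 networks, and the three networks with the lowest loss are kept, matching the per-network results reported in Table 3.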
Furthermore, batch normalization is used in the model. A simply configured computer has been used to implement the proposed grid-based deep learning model: 16 GB memory, an Intel i7-7700 processor with a 4.20 GHz clock, a 1 TB disk, and the Windows 10.1 Professional operating system. The proposed grid-based deep learning model has been implemented in the MATLAB 2020b programming environment. Firstly, we imported the pre-trained networks into MATLAB using Add-Ons, and then our proposal was implemented using m-files. The pre-trained networks have been used with their default settings, and no fine-tuning has been applied to them. Furthermore, no parallel programming method has been used, since transfer learning is used to generate the deep features.

We used accuracy, recall, precision, F1-score, and geometric mean to evaluate the performance of the proposed grid-based deep learning approach. The obtained confusion matrix is presented in Table 1, in which the true and false predictions are denoted. Moreover, class-by-class recall, precision, and F1-scores are demonstrated in Table 2. As shown in Table 2, the proposed method reached over 96% for all performance metrics and yielded 97.18% classification accuracy. The DNN classifier used 10-fold cross-validation to achieve these results; therefore, fold-wise accuracies are denoted in Figure 3. Figure 3 demonstrates that our proposal yielded 100% classification accuracy on both the third and seventh folds. The worst classification accuracy was calculated as 85.90% for the first fold.

In this work, we proposed a novel grid-based deep learning framework to attain high classification accuracy for breast cancer detection. Our proposed framework is a parametric framework in which 16 transfer learning methods are used to generate deep features, and eight grids are utilized. The top three pre-trained models are used to create a feature vector in the feature extraction phase. INCA was used to choose the best feature vector, and it selected 980 features as the best feature vector. In the classification phase, a DNN is deployed with 10-fold cross-validation. According to the results, our proposed framework achieved 97.18% accuracy without using any image augmentation method.

The proposed grid-based model uses three feature selection methods. The first two are used in feature extraction: NCA and loss-value-based selection of the top feature vectors. In order to select the top feature vectors, a cubic SVM was deployed, and the calculated accuracy rates (1 - loss) are tabulated in Table 3. Table 3 demonstrates the individual grid results of the used pre-trained networks with the cubic SVM, and the selected networks are highlighted in bold font. Our main aim is to increase the performance of breast cancer detection using BUS images beyond the best individual accuracy of 90.38% (Table 3). Therefore, we merged these feature vectors and applied INCA to the merged features. The misclassification rates and the numbers of features selected by INCA are shown in Figure 4. According to Figure 4, 980 features are selected to attain maximum classification accuracy. By using 980 features, the proposed model reached 93.59% accuracy when deploying the cubic SVM; thus, the feature concatenation and INCA process increased the maximum accuracy rate from 90.38% to 93.59%. The last phase of the grid-based deep learning model is classification.
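The reported metrics can be reproduced from a confusion matrix as in the short MATLAB sketch below. This is a generic sketch, not the authors' evaluation code; in particular, the geometric mean is assumed here to be the geometric mean of the per-class recalls, and classMetrics is a hypothetical helper name.

```matlab
% Per-class and overall metrics from a confusion matrix C
% (rows: true classes, columns: predicted classes).
function [acc, recall, precision, f1, gmean] = classMetrics(C)
    tp        = diag(C);                             % true positives per class
    recall    = tp ./ sum(C, 2);                     % per-class recall (sensitivity)
    precision = tp ./ sum(C, 1)';                    % per-class precision
    f1        = 2 * (precision .* recall) ./ (precision + recall);
    acc       = sum(tp) / sum(C(:));                 % overall accuracy
    gmean     = prod(recall) ^ (1 / numel(recall));  % geometric mean of recalls
end
```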
The cubic SVM was used to calculate the error values needed to select the most informative features. In order to increase the classification performance, a deep neural network (DNN) was then used; according to Table 2, the DNN attained 97.18% classification accuracy. This classifier (DNN) increased the classification ability of the proposed model by approximately 3.6%. These results clearly indicate that our grid-based model is a cognitive deep image classification model. The comparison with the state of the art on the same dataset is tabulated in Table 4. Table 4 denotes that most of the listed works used deep learning models to achieve high classification rates, and some methods used augmentation to attain high classification performance. Our model reached the best scores among these works (see Table 4).

In computer vision applications, CNNs have generally been used to attain high classification results, and there are many CNNs in the literature, each with its own success rate on image databases. This research proposed a general deep framework to solve the breast cancer classification problem; therefore, the feature extraction capabilities of 16 pre-trained networks have been tested on the used dataset. Fixed-size patch-based models have been used to generate local (comprehensive) deep features, but this is a computationally complex approach. To generate local deep features while decreasing the time cost, grid division has been presented. This model is an explainable image classification model since it chooses the best/most suitable models for solving image classification problems; in this respect, it is a self-organized deep feature extraction model. A DNN (deep neural network) has been applied to the features and achieved higher classification performance than other state-of-the-art methods. The important points of this work are given as follows.

• Exemplar feature generators/deep models have attained high classification performance, but their time complexity is high. A grid-based model is proposed to decrease the time complexity of the exemplar model without decreasing the classification performance.
• A novel deep image classification framework is proposed to choose the most appropriate pre-trained networks (CNNs).
• We created an ultrasonic image classification method using ResNet101, EfficientNetb0, and MobileNetV2 according to the results of our framework.
• No augmentation has been used to increase the classification accuracy.
• The proposed grid-based deep learning model is a cognitive ultrasonic image classification method for breast cancer detection.
• Our grid-based model outperforms the compared methods (see Table 4).
• Our proposed framework can be used to solve other computer vision/image classification problems in future studies.
• More and bigger datasets can be used to test the proposed approach.

Ultrasonic image classification is one of the hot research topics in biomedical engineering and computer science, since many diseases can be diagnosed using ultrasonic medical images. Furthermore, intelligent medical applications can be used in the near future to save the time of both breast cancer patients and medical professionals. Therefore, automated models have been widely presented in the literature, and the flagship of the automated classification methods is deep learning, since deep networks have achieved higher performance.
Therefore, various deep learning networks have been proposed. The main problem of deep learning is selecting the appropriate model to solve a given problem. Therefore, we proposed a new grid-based deep learning framework that automatically selects the best-performing networks to detect breast cancer using an ultrasonic image dataset. By using this dataset, ResNet101, MobileNetV2, and EfficientNetb0 were selected by the proposed framework to create the best classification method. The created model achieved 97.18% classification accuracy using 10-fold cross-validation.

References
Breast cancer
A randomized trial of letrozole in postmenopausal women after five years of tamoxifen therapy for early-stage breast cancer
Characteristics of breast masses of female patients referred for diagnostic breast ultrasound from a Saudi primary health care setting
Diagnostic value of elastography, strain ratio, and elasticity to B-mode ratio and color Doppler ultrasonography in breast lesions
Automated breast cancer detection and classification using ultrasound images: a survey
Doubly supervised parameter transfer classifier for diagnosis of breast cancer with imbalanced ultrasound imaging modalities
Breast tumor classification in ultrasound images using support vector machines and neural networks
Artificial intelligence in breast ultrasound
Automated diagnosis of breast ultrasonography images using deep neural networks
Convolutional Neural Networks based classification of breast ultrasonography images by hybrid method with respect to benign, malignant, and normal using mRMR
Automated method for improving system performance of computer-aided diagnosis in breast ultrasound
Classification of breast ultrasound images based on posterior feature
Sonography images for breast cancer texture classification in diagnosis of malignant or benign tumors
Computer-aided detection of cancer in automated 3-D breast ultrasound
Automatic identification of breast ultrasound image based on supervised block-based region segmentation algorithm and features combination migration deep learning model
An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images
Discrimination of breast tumors in ultrasonic images using an ensemble classifier based on the AdaBoost algorithm with feature selection
Computer-aided diagnosis of solid breast nodules: use of an artificial neural network based on multiple sonographic features
Dataset of breast ultrasound images
Novel multi center and threshold ternary pattern based method for disease detection method using voice
Neighbourhood components analysis
The Support Vector Method of Function Estimation
The Nature of Statistical Learning Theory
Deep Residual Learning for Image Recognition
YOLO9000: Better, Faster, Stronger
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Xception: Deep Learning with Depthwise Separable Convolutions
EfficientNet: Rethinking model scaling for convolutional neural networks
An Extremely Efficient Convolutional Neural Network for Mobile Devices
Densely Connected Convolutional Networks
Rethinking the Inception Architecture for Computer Vision
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Intriguing properties of neural networks
Very deep convolutional networks for large-scale image recognition
How transferable are features in deep neural networks
An automated COVID-19 detection based on fused dynamic exemplar pyramid feature extraction and hybrid feature selection using deep learning
Computer-aided diagnosis of breast ultrasound images using ensemble learning from convolutional neural networks
Breast mass segmentation in ultrasound with selective kernel U-Net convolutional neural network
Identification of breast malignancy by marker-controlled watershed transformation and hybrid feature set for healthcare
Breast ultrasound tumour classification: a Machine Learning-Radiomics based approach
Automated diagnosis of breast cancer using multi-modal datasets: a deep convolution neural network based approach
An efficient deep neural network based abnormality detection and multi-class breast tumor classification
SHA-MTL: soft and hard attention multi-task learning for automated breast cancer ultrasound image segmentation and classification

The authors report no conflicts of interest in this work.