key: cord-0284333-ab0sb7i3 authors: Goyal, R. title: Intracerebral Hemorrhage Detection in Computed Tomography Scans Through Cost-Sensitive Machine Learning date: 2021-10-22 journal: nan DOI: 10.1101/2021.10.20.21264515 sha: 919873460dc82dc5d6d77620cd4e715a37a0c745 doc_id: 284333 cord_uid: ab0sb7i3

Intracerebral hemorrhage is the most severe form of stroke, with a greater than 75% likelihood of death or severe disability, and half of its mortality occurs in the first 24 hours. Despite the grave nature of intracerebral hemorrhage and the high cost of false negatives in its diagnosis, only one study to date has implemented cost-sensitive techniques to minimize false negatives, even though cost-sensitive learning has shown promise in other fields. In this study, six machine learning models were trained on 160 computed tomography brain scans, both with and without utility matrices based on penalization, an implementation of cost-sensitive learning. On a test dataset of 40 scans, the highest-performing model obtained an accuracy of 97.5%, sensitivity of 95%, and specificity of 100% without penalization, and an accuracy of 92.5%, sensitivity of 100%, and specificity of 85% with penalization. In both cases, the model outperforms a range of previous work using other techniques despite the small size of, and high heterogeneity in, the dataset. Utility matrices demonstrate strong potential for sensitive yet accurate artificial intelligence techniques in medical contexts and workflows where a reduction of false negatives is crucial.

Of the example scans in Figure 1, one displays a large intracerebral hemorrhage with intraventricular extension, and Figure 1d contains a small intracerebral hemorrhage without intraventricular extension. A trained radiologist confirmed the veracity of these images and found no mislabeled images, so none of the images were discarded. To obtain the most accurate representation of model performance in real-life clinical scenarios, the images were not augmented in any way. Two datasets were subsequently created: a training dataset with 160 images and a testing dataset with 40 images.

In a decision tree, the root node is the first node; it represents a decision that divides the entire dataset into two or more subsets. Internal nodes represent further choices that split those subsets, and the tree eventually ends in leaf nodes, which represent the final result of a series of choices. Any node emanating from another node can be referred to as its child node [20]. Decision trees can use both discrete and continuous variables to set the splitting criteria at root and internal nodes, which divide the data into further internal or leaf nodes.

In a support vector machine, for n-dimensional data a hyperplane of n-1 dimensions (the higher-dimensional analogue of a line or plane) is used to separate the data into two groups such that its distance to the nearest points of each group is maximized and the hyperplane sits 'in the middle', so to speak. This hyperplane then dictates which labels are assigned to the samples from the test set [28].
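As a rough illustration of how such a model comparison might be set up, the sketch below trains the six classifier families named in this study on a 160/40 split. It is only an assumption-laden sketch: the study does not publish its code, so the scikit-learn estimators, their default hyperparameters, and the randomly generated placeholder features standing in for CT-derived inputs are illustrative choices rather than the authors' pipeline.

```python
# Minimal sketch (not the authors' code): training the six classifier
# families described in the study on feature vectors. Assumes scikit-learn;
# random placeholder data stands in for features extracted from CT scans.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Placeholder data: in the study, X would hold features derived from the
# 160 training and 40 test scans, and y the hemorrhage / no-hemorrhage labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(160, 64)), rng.integers(0, 2, 160)
X_test, y_test = rng.normal(size=(40, 64)), rng.integers(0, 2, 40)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosted trees": GradientBoostingClassifier(random_state=0),
    "nearest neighbors": KNeighborsClassifier(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "support vector machine": SVC(probability=True, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```

In practice the feature vectors would be computed from the scans themselves, a step that is outside the scope of this sketch.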
For the second part of the study, the concept of utility functions was used. A utility function is a mathematical function through which preferences for various outcomes can be quantified [31]. The utility matrix created for this study is shown in Table 2. Essentially, models trained using this matrix have an aversion to false negatives, since each false negative is penalized. Depending on the context in which an algorithm is being implemented, the goals for sensitivity and specificity will vary, and the cost must be determined accordingly.

In Round 2, the models were re-trained with the same parameters as the previous set, the only difference being that the value of the 'UtilityFunction' parameter was set to the given utility matrix.

The final results for all of the models trained are given in Table 3. The probability histograms for the three most accurate models are given in Figure 2; these histograms show the actual-class probability of the 40 test samples. Table 4 lists the results for each of the models after setting their utility function to the matrix given in Table 2, for three values of -x. Finally, Figure 3 shows, for the top three models, how their sensitivities and specificities change as the penalty is increased, with 0 being the default penalty that corresponds to the unpenalized results in Table 3.

According to the data in Table 3, the support vector machine model is the most accurate as well as the most sensitive and specific. Logistic regression and gradient boosted trees score second and third respectively in overall accuracy, though the nearest neighbors model has a higher specificity than gradient boosted trees, placing second in specificity and tied with logistic regression.

In Table 4, for a penalty of -1, three models give 100% sensitivity, although for the decision tree that comes at the cost of a 0% specificity. The support vector machine performs the best, matching the 100% sensitivity of gradient boosted trees; its specificity of 85% is lower than that achieved by logistic regression, though its sensitivity is greater. The results for a -2 penalty are almost exactly the same, the only difference being a slightly reduced accuracy for logistic regression. When the penalty is increased to -3, model performances decrease across the board, except for the random forest, which performs better. While the sensitivities of almost all the models are now 100%, they come at the expense of drastically reduced specificities. Further values of -x were not tested, as most model performances had started deteriorating rapidly at a -3 penalty. The random forest, however, might perform better at more negative penalties, based on the observed trends.

As expected, Figure 3 shows that greater penalties yield reductions in specificity and growth in sensitivity, though in cases like Figure 3a and Figure 3c, where the sensitivity reaches its maximum early, further increases in the penalty only erode specificity.
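Conceptually, applying a utility matrix at prediction time amounts to choosing the class with the highest expected utility rather than the most probable class. The sketch below is a minimal NumPy illustration of that decision rule, not the study's 'UtilityFunction' implementation: it assumes a matrix that rewards correct predictions with utility 1, gives false positives utility 0, and gives false negatives a utility of -x, and it uses randomly generated probabilities in place of the models' real predictions and the exact Table 2 values.

```python
# Minimal sketch of a cost-sensitive decision rule: instead of taking the
# most probable class, choose the class with the highest expected utility
# under a utility matrix that penalizes false negatives. The utility values
# and the penalty sweep below are illustrative assumptions.
import numpy as np

def expected_utility_predict(probs, utility):
    # probs: (n_samples, n_classes) predicted class probabilities
    # utility[i, j]: utility of predicting class j when the true class is i
    return np.argmax(probs @ utility, axis=1)

def sensitivity_specificity(y_true, y_pred, positive=1):
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    tn = np.sum((y_pred != positive) & (y_true != positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    return tp / (tp + fn), tn / (tn + fp)

# probs would come from a trained model's predict_proba on the 40 test scans;
# random values stand in here so the sketch runs on its own.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(2), size=40)   # columns: [no hemorrhage, hemorrhage]
y_true = rng.integers(0, 2, 40)

for penalty in (0, -1, -2, -3):
    # Rows = true class, columns = predicted class; a false negative
    # (true hemorrhage predicted as negative) receives utility `penalty`.
    utility = np.array([[1.0, 0.0],
                        [float(penalty), 1.0]])
    y_pred = expected_utility_predict(probs, utility)
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"penalty {penalty}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Sweeping the false-negative penalty this way shifts the decision threshold toward the positive class, which is the mechanism behind the sensitivity gains and specificity losses reported in Table 4 and Figure 3.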
References

... imbalanced EEG data sets. Int. J. Interact. Des. Manuf., 14(4):1491-1509, Dec 2020.
[12] Ali Akbar Septiandri, Aditiawarman, Roy Tjiong, Erlina Burhan, and Anuraj Shankar. Cost-
Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall.
Charles Elkan. The foundations of cost-sensitive learning.
Multi-class misclassification cost matrix for credit ratings in peer-to-peer lending.
RADnet: Radiologist level accuracy using deep learning for hemorrhage detection in CT scans.
Application of Deep Learning in Neuroradiology: Brain Haemorrhage Classification Using Transfer Learning.
Deep 3D convolution neural network for CT brain hemorrhage classification.
Analysis of Intracranial Hemorrhage in CT Brain Images Using Machine Learning and Deep Learning Algorithm.
Brain Hemorrhage Detection based on Heat Maps, Autoencoder and CNN Architecture. Informatics and Software Engineering Conference (UBMYK).
Expert-level detection of acute intracranial hemorrhage on head computed tomography using deep learning.