key: cord-0777669-vrxd415i
authors: Oloko-Oba, Mustapha; Viriri, Serestina
title: A Systematic Review of Deep Learning Techniques for Tuberculosis Detection From Chest Radiograph
date: 2022-03-10
journal: Front Med (Lausanne)
DOI: 10.3389/fmed.2022.830515
sha: 889ece6205f5c993ec478fd712b196510602e818
doc_id: 777669
cord_uid: vrxd415i

The mortality rate in high Tuberculosis (TB) burden regions has increased significantly in the last decades. Although TB is treatable, high-burden regions still lack adequate screening tools, which results in diagnostic delay and misdiagnosis. These challenges have led to the development of Computer-Aided Diagnostic (CAD) systems to detect TB automatically. There are several ways of screening for TB, but Chest X-Ray (CXR) is the most prominent and is recommended due to its high sensitivity in detecting lung abnormalities. This paper presents the results of a systematic review, based on PRISMA procedures, that investigates state-of-the-art Deep Learning techniques for screening pulmonary abnormalities related to TB. Four scientific databases that grant access to distinctive articles in the field were searched to retrieve related articles. Inclusion and exclusion criteria were defined and applied to each article to determine those included in the study. Out of the 489 articles retrieved, 62 were included. Based on the findings of this review, we conclude that CAD systems are promising in tackling the challenges of the TB epidemic, and we make recommendations for improvement in future studies.

Tuberculosis (TB) is ranked among the leading causes of death. About 10 million persons fell ill globally from TB infections in 2019 (1). TB is caused by Mycobacterium tuberculosis bacteria, which usually affect the lungs (pulmonary TB) but sometimes affect other parts of the body (extrapulmonary TB) (2). Many TB patients lose their lives yearly due to diagnostic delay, misdiagnosis, and lack of appropriate treatment (3, 4). Although TB is a global challenge, mortality is highest in low and middle-income nations (5).

TB is treatable if it is diagnosed early. Early diagnosis is essential for successful treatment, preventing further spread, and significantly reducing the mortality rate in line with the World Health Organization (WHO) End TB Strategy (1). The gold standard for TB screening is sputum culture. However, posterior-anterior chest radiographs (CXR) are an effective, low-cost technique with moderately low radiation doses for screening lung abnormalities and obtaining prompt results (6). CXR has been adequately employed in developed countries to examine individuals exhibiting active TB symptoms. However, its application is limited in developing countries, where TB is most prevalent (7, 8). High TB burden regions lack the skilled radiological expertise required to interpret CXR images adequately (9, 10).

In the last decades, several efforts have been made using Artificial Intelligence (AI) to develop Computer-Aided Detection (CAD) systems that advance automatic object/image recognition tasks and compensate for the shortage of skilled personnel. Machine Learning (ML) and Deep Learning (DL) are the predominant AI techniques employed to develop CAD systems for analyzing CXR images. Both techniques have had a significant impact, but DL approaches, such as the Convolutional Neural Network (CNN), have become more prominent for analyzing different pulmonary abnormalities in the medical domain, most importantly in diagnosing TB. An efficient classification tool is vital for improving the quality of diagnosis while reducing the time taken to analyze a large volume of CXRs (11). The aim is to accelerate the global decline in TB incidence to about 5% annually, compared to the current 2% yearly, as part of the World Health Organization strategy to end TB (1).

The contribution of this systematic review is to present an extensive summary of the various state-of-the-art CAD systems proposed in the literature for the classification of TB. Only CAD systems developed using Deep Learning models are considered in this study, which details the diagnostic accuracy reported between 2017 and 2021. The rest of the paper is structured as follows: section Methodology presents the study methodology. The results are presented in section Results. Section Discussion presents the discussion, while the conclusion is given in section Conclusion and Recommendations.

This systematic review aims to identify the various CAD systems for Tuberculosis diagnosis from CXR using DL techniques. The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) procedures (12) to identify the standards for inclusion and exclusion, as shown in Figure 1. These standards were formulated based on the present study objectives and the research questions. Articles that satisfied all of the following conditions were included; all others were excluded:

• Articles that considered only pulmonary tuberculosis disease.

• Articles that employed at least one deep learning technique as a classifier.

• Articles in which CXR is the only medical imaging modality used for screening tuberculosis.

• Articles published between January 2017 and September 2021.

• Articles are entirely written in English.

• Articles must be full text. All others, such as abstracts and preprints, are excluded.
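The inclusion criteria listed above can be read as a conjunction of simple checks applied to each retrieved article. The following is a minimal sketch of that idea, not the screening procedure the authors actually used: the `Article` record, its field names, the example titles, and the exact end date of the window (30 September 2021) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    # Hypothetical record layout; the review does not publish its screening data.
    title: str
    published: date
    language: str
    full_text: bool
    pulmonary_tb_only: bool
    uses_dl_classifier: bool
    imaging_modalities: tuple


# Assumed bounds for "January 2017 to September 2021".
WINDOW_START = date(2017, 1, 1)
WINDOW_END = date(2021, 9, 30)


def is_eligible(a: Article) -> bool:
    """Return True only if every inclusion criterion holds."""
    return (
        a.pulmonary_tb_only
        and a.uses_dl_classifier
        and a.imaging_modalities == ("CXR",)
        and WINDOW_START <= a.published <= WINDOW_END
        and a.language == "English"
        and a.full_text
    )


# Made-up candidate records, for illustration only.
candidates = [
    Article("Example A", date(2019, 5, 1), "English", True, True, True, ("CXR",)),
    Article("Example B", date(2020, 2, 1), "English", True, True, True, ("CXR", "CT")),
    Article("Example C", date(2016, 11, 1), "English", True, True, True, ("CXR",)),
]
included = [a.title for a in candidates if is_eligible(a)]
print(included)
```

In this toy run, Example B is excluded because it uses a second imaging modality and Example C because it falls outside the publication window.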

The search strategy was developed to identify relevant published articles in Scopus (https://www.scopus.com/), IEEEXplore (https://ieeexplore.ieee.org/), Web of Science (www.webofscience.com), and PubMed (https://www.ncbi.nlm.nih.gov/pubmed/). Searches were performed across the four databases using the following search keywords: "tuberculosis," "chest x-ray," "classification," "artificial intelligence," "computer-aided diagnosis," and "deep learning." These keywords were used to form Boolean search strings according to the searching standards of the different databases. The configuration of keywords used to retrieve all relevant articles from each database search engine is presented in Table 1.

Table 1

Scopus: (TITLE-ABS-KEY ("Tuberculosis" AND "Chest X-Ray") AND TITLE-ABS-KEY ("Deep learning" OR "Machine learning" OR "Artificial Intelligence" OR "Classification")

IEEEXplore: (Tuberculosis) AND ("Chest X-Ray") AND ("Deep learning" OR "Machine learning" OR "classification" OR "artificial intelligence") AND ("CAD" OR "computer-aided detection")

Web of Science: ((Tuberculosis AND Chest x-ray) AND ("Machine learning" OR "Deep learning" OR "Artificial intelligence") AND ("classification" OR "classify") AND ("computer-aided diagnosis" OR "CAD"))

PubMed: ("Tuberculosis") AND ("chest x-ray") AND ("deep learning" OR "convolutional neural network") AND ("classify" or "classification") OR ("computer-aided diagnosis" OR "computer-aided detection" OR "CAD")
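As a rough illustration of how keyword groups combine into a Boolean search string, the sketch below ORs synonyms within a group and ANDs the groups together. The helper names `or_group` and `build_query` are hypothetical, and database-specific field tags (such as Scopus's TITLE-ABS-KEY) are deliberately left out, so the output only approximates the strings reported for each database.

```python
def or_group(terms):
    """Join quoted terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(groups):
    """AND together several OR-groups of keywords."""
    return " AND ".join(or_group(g) for g in groups)


# Keyword groups taken from the search terms named in the text.
groups = [
    ["Tuberculosis"],
    ["Chest X-Ray"],
    ["Deep learning", "Machine learning", "Artificial Intelligence", "Classification"],
]
query = build_query(groups)
print(query)
```

Keeping the groups as data makes it straightforward to adapt the same keywords to the syntax of each database.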

A total of 488 articles were retrieved from the initial search on all the databases. The article titles and keywords were then scanned to separate highly related papers from irrelevant ones. This step was followed by an overview reading of the abstract, methods, and conclusion to further screen for papers relevant for full-text reading. Thus, 209 articles were considered for title and abstract screening, out of which 96 papers were found suitable for full-text reading, and a total of 62 were finally included in the analysis after removing 34 duplicates. The detailed structure of the study selection is presented in Figure 1, and Figure 2 shows the number of articles included per year and database. It is necessary to note that some relevant articles might have been unintentionally omitted.

The details extracted from each article are presented in Tables 

This study reviewed articles on computer-aided diagnosis systems for detecting pulmonary TB published from January 2017 until September 2021. It was observed from the articles that the development of CAD follows a standard framework involving four steps, as follows:

"Pre-processing" is the first step, which cleans up the CXR images by eliminating noise and enhancing clarity. The second step is "segmentation" of a region of interest from the entire image, which is the lung field region in the case of CXR. The third step is "feature extraction," where discriminative features are identified and selected for further analysis in the fourth step, "classification," where the images are categorized as normal or abnormal (infected with TB). Several techniques, such as handcrafted features, machine learning, and deep learning, have been employed to diagnose TB, but DL has recorded more success in this regard; hence our interest was to analyze CAD systems based on one or more DL techniques as the classifier for TB detection. The descriptive analysis of the results is presented in Tables 2-5. These tables show the computational technique, study scope, datasets, evaluation criteria, and results. Only articles that utilized CXR as the only imaging modality and employed DL as the only computational technique for developing CAD are considered.
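The four-step framework described above can be pictured as a chain of functions. The sketch below is only a toy skeleton under strong assumptions: every stage is a deliberately crude stand-in (min-max rescaling, a central crop, two summary statistics, and a fixed threshold), none of it comes from the reviewed studies, and in a DL-based CAD system a CNN would typically learn the feature extraction and classification stages jointly.

```python
from statistics import mean, pstdev


def preprocess(image):
    """Rescale pixel values to [0, 1] (stand-in for denoising/enhancement)."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = (hi - lo) or 1  # avoid division by zero on a constant image
    return [[(p - lo) / span for p in row] for row in image]


def segment(image):
    """Keep the central region (crude stand-in for lung-field segmentation)."""
    h, w = len(image), len(image[0])
    return [row[w // 4: w - w // 4] for row in image[h // 4: h - h // 4]]


def extract_features(region):
    """Summarise the region by its mean and standard deviation."""
    pixels = [p for row in region for p in row]
    return mean(pixels), pstdev(pixels)


def classify(features, threshold=0.5):
    """Toy rule: flag as abnormal when mean intensity exceeds the threshold."""
    return "abnormal" if features[0] > threshold else "normal"


def run_pipeline(image):
    """Apply the four stages in order and return the predicted label."""
    return classify(extract_features(segment(preprocess(image))))


# Made-up 4x4 "images" with pixel values in 0..255.
bright = [[0, 0, 0, 0], [0, 200, 220, 0], [0, 210, 255, 0], [0, 0, 0, 0]]
dark = [[0, 0, 0, 0], [0, 10, 20, 0], [0, 15, 5, 0], [255, 0, 0, 0]]
print(run_pipeline(bright), run_pipeline(dark))
```

The point of the skeleton is the ordering of the stages and the data passed between them, not the placeholder logic inside each one.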

Different screening procedures are used to confirm the presence of TB. Among them, the chest radiograph, otherwise referred to as Chest X-ray (CXR), is a radiographic tool used to detect abnormalities in the lungs and nearby structures. In recent years, CXR has remained a vital method for screening TB and other lung diseases and is recommended by WHO due to its high sensitivity, wide availability, and relatively low cost (70, 71).

Many DL techniques have been used for screening, predicting, and diagnosing TB. Most classification algorithms require a dataset with many samples of inputs and outputs to learn from, which is divided into training and test sets. A model is developed using the training set to learn how best to map examples of input data to a specific class label; the model is then validated using the test set (72).
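The division into training and test sets mentioned above can be sketched as follows. This is a generic illustration, assuming an 80/20 split and a fixed random seed; the function name `train_test_split`, the file names, and the labels are made up and do not reflect the split used by any reviewed study.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed, then cut into disjoint train and test lists."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up (file name, label) pairs: label 1 = TB, label 0 = normal.
samples = [(f"cxr_{i:03d}.png", i % 2) for i in range(10)]
train, test = train_test_split(samples)
print(len(train), len(test))
```

Because the two lists are slices of one shuffled copy, no sample can appear in both, which is the property the test set needs in order to give an honest estimate.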

In Tables 2-5, only the results obtained from the test set are extracted. Some studies employed more than one DL technique to find the optimal results for diagnosing TB. If more than one accuracy is reported in an article, only the best result is documented in this study.

Data is crucial for developing the CAD systems needed to tackle life-threatening diseases, including TB, one of the leading causes of death worldwide. The popular datasets used in developing TB detection algorithms contain de-identified CXR images to protect patient privacy; in other words, the identity of the patients is not disclosed. Most of the datasets are accompanied by radiological interpretations of the observed manifestations that can serve as ground truth. These datasets are made available to the research community to foster state-of-the-art research into lasting solutions for the early diagnosis of TB manifestations. Some of these public datasets include Montgomery County (73). Figure 3 shows the datasets' frequency of use.

Once a model is trained using the training images, the test dataset is employed to assess the quality of the model. It is evident from the data extraction process that different evaluation metrics exist for assessing model performance. The most popular evaluation metrics applied to the development of CAD systems include accuracy, sensitivity, specificity, and AUC. These evaluation metrics are briefly explained as follows:

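Accuracy is the proportion of correctly classified samples out of the total number of samples examined (77). With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, respectively, accuracy is given as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Sensitivity is the proportion of the actual positive samples that are correctly identified as positive (78). This metric is given as:

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

Specificity is the proportion of the actual negative samples that are correctly identified as negative (78). This metric is given as:

$$\text{Specificity} = \frac{TN}{TN + FP}$$

Precision, otherwise known as positive predictive value, is the proportion of samples predicted as positive that are actually positive (77). This is given as:

$$\text{Precision} = \frac{TP}{TP + FP}$$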
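The metric definitions above can be computed directly from the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The sketch below is a minimal illustration with made-up labels and scores; the function names are hypothetical. It also includes AUC, which is named in the text but not defined there, computed here in its standard pairwise form: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = TB, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and precision from the counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
    }


def auc(y_true, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Made-up ground truth and model scores, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.6, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

result = metrics(y_true, y_pred)
area = auc(y_true, scores)
print(result, area)
```

Accuracy, sensitivity, specificity, and precision depend on the chosen decision threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason studies often report it alongside the others.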

This systematic review extensively searched for the various Deep Learning classifiers used for diagnosing TB from CXR. Automatic detection of TB has received considerable attention in the last decades, resulting in many publications with state-of-the-art techniques. Figure 4 presents a hierarchical chart of computational techniques categorized according to their frequency of usage in CAD systems for TB, based on the included articles. Despite the good accuracies reported in some studies, we found limitations in the existing studies concerning methods and reported accuracy, which should be taken into consideration when developing CAD systems in the future. Many studies measured diagnostic accuracy without evaluating the risk of bias emerging from the datasets that were used. It is essential that the accuracy of CAD systems is evaluated using a set of CXR images different from the set used for training. In other words, avoid

• Using the same set of CXR images for training and testing.

• Testing with CXR images that were not used for training but originated from the same image subset.

• Using images with class imbalance.

• Using unannotated CXR images.

Otherwise, the diagnostic accuracy evaluation is likely to be exaggerated and could impact the overall generalization of the system. In general, about 80% of the studies used the As evident from the literature, these pre-trained models (VggNet, ResNet, AlexNet, DenseNet, and Inception) are the most popular and have been extensively explored for the classification of TB, as shown in Figure 4 . Despite the effectiveness of Deep Learning models in detection and classification tasks, CAD systems for clinical diagnosis are still challenging in a real-world scenario. The physicians and radiologists see CAD intervention as a threat to their jobs rather than a supporting system to improve physicians' performance in terms of time, effort, efficiency, and affordability, especially in developing countries. This review found that most existing works focused on development studies rather than clinical studies.

One likely weakness of this review is that it is limited to studies written in English, although there might be other high-quality studies written in other languages. Also, we restricted the computational techniques to Deep Learning classifiers only, which could increase the risk of classifier bias, since studies that employed a hybrid of Machine Learning and Deep Learning could have achieved better performance. Furthermore, this study did not undertake a meta-analysis due to variations in the algorithms used. Also, the raw data required to meta-analyze the diagnostic accuracy were unavailable for most studies.

It is recommended that studies be carried out using standardized public datasets that contain additional masks of the images, which can be used as ground truth to detect the infected regions of the pulmonary images. It is highly recommended that models are trained with one set of images and evaluated on a different set of images. For instance, training a model with the Shenzhen dataset and evaluating it on the Montgomery dataset gives a better indication of generalization. It is also recommended that future CAD systems focus more on clinical evaluation and be able to identify foreign objects, such as buttons and rings, that look like nodules on CXR images and may lead to misclassification.
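The recommendation to train and evaluate on different image sets, together with the earlier caution about class imbalance, suggests two simple sanity checks before reporting accuracy. The sketch below is an assumption-laden illustration: the image identifiers and labels are made up, and the helper names `overlap` and `class_balance` are hypothetical rather than part of any dataset's tooling.

```python
from collections import Counter


def overlap(train_ids, test_ids):
    """Return image identifiers that appear in both sets (should be empty)."""
    return sorted(set(train_ids) & set(test_ids))


def class_balance(labels):
    """Return the fraction of samples in each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Made-up identifiers and labels (1 = TB, 0 = normal), for illustration only.
train = {"shenzhen_0001": 1, "shenzhen_0002": 0, "shenzhen_0003": 1, "shenzhen_0004": 0}
test = {"montgomery_0001": 0, "montgomery_0002": 0, "montgomery_0003": 1, "montgomery_0004": 0}

leaked = overlap(train, test)
balance = class_balance(test.values())
print(leaked, balance)
```

An empty overlap confirms that no image is shared between training and evaluation, and the class fractions make an imbalanced test set visible before accuracy figures are interpreted.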

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

Global Tuberculosis Report 2020: Executive Summary. Available online at

Causes of stigma and discrimination associated with tuberculosis in Nepal: a qualitative study

Deep-learning: a potential method for tuberculosis detection using chest radiography

The WHO 2014 global tuberculosis report-further to go

Deep learning for automated classification of tuberculosis-related chest x-ray: dataset specificity limits diagnostic performance generalizability

Chest Radiography in Tuberculosis Detection: Summary of Current WHO Recommendations and Guidance on Programmatic Approaches

The use of X-ray examinations in pulmonary tuberculosis

Use of chest radiography in the 22 highest tuberculosis burden countries

A statistical interpretation of the chest radiograph for the detection of pulmonary tuberculosis

Can tuberculosis patients in resource-constrained settings afford chest radiography?

Deep learning in neural networks: an overview

Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement

Detection of tuberculosis from chest X-ray images: boosting the performance with vision transformer and transfer learning

Evaluation of image processing technologies for pulmonary tuberculosis detection based on deep learning convolutional neural networks

Deep pre-trained networks as a feature extractor with XGBoost to detect tuberculosis from chest X-ray

Extreme learning machine based differentiation of pulmonary tuberculosis in chest radiographs using integrated local feature descriptors

Convolutional neural networks model for screening tuberculosis disease

Proposing a novel multi-instance learning model for tuberculosis recognition from chest X-ray images based on CNNs, complex networks and stacked ensemble

Ensemble learning based automatic detection of tuberculosis in chest X-ray images using hybrid feature descriptors

Tuberculosis detection from CXR: an approach using transfer learning with various CNN architectures

Deep learning methods for screening pulmonary tuberculosis using chest X-rays

Spatial pyramid pooling in deep convolutional networks for automatic tuberculosis diagnosis

Comprehensive computer-aided decision support framework to diagnose tuberculosis from chest X-ray images: data mining study

Study on the TB and non-TB diagnosis using two-step deep learning-based binary classifier

Deep learning for automated classification of tuberculosis-related chest X-Ray: dataset distribution shift limits diagnostic performance generalizability

A novel method for detection of tuberculosis in chest radiographs using artificial ecosystem-based optimization of deep neural network features

Inception-based deep learning architecture for tuberculosis screening using chest X-rays

Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization

Image enhancement for tuberculosis detection using deep learning

Tuberculosis abnormality detection in chest X-rays: a deep learning approach

Computer-aided system for the detection of multicategory pulmonary tuberculosis in radiographs

An efficient framework for identification of tuberculosis and pneumonia in chest X-ray images using Neural Network

Pre-processing effects of the tuberculosis chest x-ray images on pretrained cnns: an investigation

Modality-specific deep learning model ensembles toward improving TB detection in chest radiographs

Uncertainty assisted robust tuberculosis identification with bayesian convolutional neural networks

Ensemble deep learning for tuberculosis detection using chest X-Ray and canny edge detected images

Efficient deep network architectures for fast chest X-ray tuberculosis screening and visualization

Detection of pulmonary tuberculosis manifestation in chest x-rays using different convolutional neural network (CNN) models

Application of a convolutional neural network using transfer learning for tuberculosis detection

Deep learning models for tuberculosis detection from chest x-ray images

Utilizing pretrained deep learning models for automated pulmonary tuberculosis detection using chest radiography

Deep learning algorithms with demographic information help to detect tuberculosis in chest radiographs in annual workers' health examination data

An ensemble algorithm based on deep learning for tuberculosis classification

Ensemble deep learning for tuberculosis detection

Learning transformations for automated classification of manifestation of tuberculosis using convolutional neural network

Chest X-ray analysis of tuberculosis by convolutional neural networks with affine transforms

Chest X-ray analysis of tuberculosis by deep learning with segmentation and augmentation

Detection of tuberculosis patterns in digital photographs of chest X-ray images using deep learning: feasibility study

Detecting tuberculosis in chest X-ray images using convolutional neural network

Ensemble of convolution neural networks for automatic tuberculosis classification

Deep learning-based automated detection algorithm for active pulmonary tuberculosis on chest radiographs: diagnostic performance in systematic screening of asymptomatic individuals

A study on tuberculosis classification in chest X-ray using deep residual attention networks

Development and validation of a deep learning-based automatic detection algorithm for active pulmonary tuberculosis on chest radiographs

Deep learning for chest radiograph diagnosis: a retrospective comparison of the CheXNeXt algorithm to practicing radiologists

Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks

X-ray classification of tuberculosis based on convolutional networks

Comparative study for tuberculosis detection by using deep learning

Exploiting cascaded ensemble of features for the detection of tuberculosis using chest radiographs

Detection of tuberculosis from chest X-ray images based on modified inception deep neural network model

Object detection and segmentation in chest X-rays for tuberculosis screening

Hybrid RID network for efficient diagnosis of tuberculosis from chest X-rays

Computer-aided interpretation of chest radiography reveals the spectrum of tuberculosis in rural South Africa

iDoc-X: an artificial intelligence model for tuberculosis diagnosis and localization

CheXaid: deep learning assistance for physician diagnosis of tuberculosis using chest x-rays in patients with HIV

Application of convolutional neural networks for diagnostics of tuberculosis

Deep feature learning from a hospital-scale chest xray dataset with application to TB detection on a small-scale dataset

Classification of pulmonary tuberculosis lesion with convolutional neural networks

Comparing deep learning models for population screening using chest radiography

Feature selection for automatic tuberculosis screening in frontal chest radiographs

Interpretation of the chest radiograph. Continuing education in anaesthesia

The world health organization standards for tuberculosis care and management

4 Types of Classification Tasks in Machine Learning

Two public chest X-ray datasets for computer-aided screening of pulmonary diseases

Chexpert: a large chest radiograph dataset with uncertainty labels and expert comparison

A novel approach for tuberculosis screening based on deep convolutional neural networks

Clinical tests: sensitivity and specificity

The relationship between precision-recall and ROC curves

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.