key: cord-0745280-c3i0e17n authors: Hsu, William; Baumgartner, Christian; Deserno, Thomas M. title: Notable Papers and New Directions in Sensors, Signals, and Imaging Informatics date: 2021-09-03 journal: Yearb Med Inform DOI: 10.1055/s-0041-1726526 sha: 60295fe55ec928761ec5c660e0f2334bcd2e7b95 doc_id: 745280 cord_uid: c3i0e17n Objective: To identify and highlight research papers representing noteworthy developments in signals, sensors, and imaging informatics in 2020. Method: A broad literature search was conducted on PubMed and Scopus databases. We combined Medical Subject Heading (MeSH) terms and keywords to construct particular queries for sensors, signals, and image informatics. We only considered papers published in journals that contributed at least three articles to the query results. Section editors then independently reviewed the titles and abstracts of the preselected papers and rated them on a three-point Likert scale. Papers were rated from 1 (do not include) to 3 (should be included) for each topical area (sensors, signals, and imaging informatics) and those with an average score of 2 or above were subsequently read and assessed again by two of the three co-editors. Finally, the top 14 papers with the highest combined scores were considered based on consensus. Results: The search for papers was executed in January 2021. After removing duplicates and conference proceedings, the query returned a set of 101, 193, and 529 papers for sensors, signals, and imaging informatics, respectively. We filtered out journals that had fewer than three papers in the query results, reducing the number of papers to 41, 117, and 333, respectively. From these, the co-editors identified 22 candidate papers with more than 2 Likert points on average, from which 14 candidate best papers were nominated after intensive discussion. At least five external reviewers then rated the remaining papers. 
The four finalist papers were found using the composite rating of all external reviewers. These best papers were approved by consensus of the International Medical Informatics Association (IMIA) Yearbook editorial board. Conclusions: Sensors, signals, and imaging informatics is a dynamic field of intense research. The four best papers represent advanced approaches for combining, processing, modeling, and analyzing heterogeneous sensor and imaging data. The selected papers demonstrate the combination and fusion of multiple sensors and sensor networks using electrocardiogram (ECG), electroencephalogram (EEG), or photoplethysmogram (PPG) with advanced data processing, deep and machine learning techniques, and present image processing modalities beyond state-of-the-art that significantly support and further improve medical decision making. Yearb Med Inform 2021:150-8
Sensors, signals, and imaging informatics (SSII) continues to be a rapidly growing research field. One could see three independent parts, or only two if signal and imaging informatics are considered together: a biomedical signal is a one-dimensional stream and a medical image a two- or more-dimensional one, and the methods applied to both are similar. In contrast, the sensor part could be seen as more device-oriented. Picard & Wolf define "sensor informatics" as new technologies and applications for medical services incorporating wearable sensors, signal processing, machine learning, and data mining techniques [1]. In our view, the technological development of a sensing device is not part of medical informatics, but the integration of such devices in medical information systems and their application in research, clinical trials, and medical care are. Unobtrusive health monitoring in private spaces such as the car or the home is based on various sensors and is rapidly growing in research and applications [1] [2] [3]. Witte et al. see "signal informatics" as an advanced integrative concept in the framework of medical informatics [4]. Again, data integration is emphasized. In the medical field, semantic integration is particularly important. In a recent review, Cook defines "imaging informatics" via the imaging informaticist as a unique individual who sits at the intersection of clinical radiology, data science, and information technology [5]. Imaging informatics, however, is the most common term of these three. In 2020, we observed three new reviews in this field [5] [6] [7]. Furthermore, the first standardized curriculum for imaging informatics fellowships, suggested by the Society for Computer Applications in Radiology (SCAR) in 2004 [8], has been updated [9]. In this variety, we faced the daunting task of identifying notable research. 
Furthermore, thousands of research papers have applied deep learning approaches to well-documented SSII problems, mostly outperforming the classical state of the art. On the one hand, we faced a needle-in-a-haystack problem given the large number of papers; on the other, we felt as if we were sorting through genetically modified strawberries that all looked alike. In summary, identifying real novelty among papers in SSII is not easy when considering solely the title or the abstract. As van Ooijen, Nagaraj & Olthof [10] point out, SSII is more than 'just' deep learning. As such, our objective was to capture the spectrum of work (both deep learning and other machine learning techniques) that best represents the developments in SSII in 2020.
The process of searching the literature for candidate best papers by the SSII section remained a challenging task, given the broad nature of the SSII category. This year, we overhauled the queries that were applied in previous years [11] [12] [13]. First, we developed similar queries to search PubMed and Scopus. We focused on research articles in the English language and excluded all others. We did not include review papers. Then, we built the queries separately for sensors, signals, and images. Each of the six queries was built from two blocks. For sensors, the first block captures all relevant terms for the sensing device, and the second block lists keywords for biomedical signals and vital signs. For signals and images, we used modality names (e.g., computed tomography, magnetic resonance angiography) and processing techniques. In addition, we significantly decreased the number of results for the imaging query by applying the MeSH term "medical informatics". The queries are listed completely in Appendix 2. In mid-January 2021, we executed the final query. 
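The two-block structure of the queries described above can be illustrated with a short sketch. It is not the tooling used for this survey: the helper names `or_block` and `two_block_query` are hypothetical, the term lists are shortened examples, and database-specific field tags and date filters are left out. Joining terms programmatically guarantees balanced quotes and an explicit operator between every pair of terms.

```python
def or_block(terms):
    """Join terms into one parenthesised OR block, quoting each term.

    Hypothetical helper: empty strings and repeated terms are dropped,
    and the original order of the remaining terms is preserved.
    """
    cleaned = []
    for term in terms:
        term = term.strip()
        if term and term not in cleaned:
            cleaned.append(term)
    if not cleaned:
        raise ValueError("a query block needs at least one term")
    return "( " + " OR ".join(f'"{t}"' for t in cleaned) + " )"


def two_block_query(first_block, second_block):
    """Combine two OR blocks with AND, mirroring the two-block design."""
    return "( " + or_block(first_block) + " AND " + or_block(second_block) + " )"


# Shortened, illustrative term lists (not the full lists from Appendix 2).
modalities = ["computed tomography", "magnetic resonance angiography"]
techniques = ["segmentation", "classification", "segmentation"]

query = two_block_query(modalities, techniques)
print(query)
```

The same two lists can then be reused for PubMed and Scopus, with only the surrounding field tags and filters adapted to each database.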
After removing duplicates and conference proceedings, the query returned a set of 101, 193, and 529 papers for sensors, signals, and imaging informatics, respectively (Table 1). Given these numbers, we sought to further reduce the number of papers to be reviewed. During the initial review of abstracts, we noted that several excluded papers were published in journals that were only tangentially related to the topic. We therefore explored whether setting a threshold on the minimum number of papers per journal in the results would help. Figure 1 illustrates the effect of setting different thresholds on the number of papers remaining in the imaging informatics set. With a threshold of at least three papers, 333 papers remained. We also discussed focusing on official International Medical Informatics Association (IMIA) journals, impact factors, or the set of journals that are linked to "medical informatics" in MEDLINE or Index Medicus. However, as new open access journals are establishing themselves and gaining broad acceptance among scientists, we stayed with the contribution model. Next year, we plan to normalize the threshold by the annual number of papers published in each journal.
We reviewed the titles and abstracts and independently ranked them on a three-point Likert scale. Papers were rated from 1 (do not include) to 3 (should be included) for each topical area (sensors, signals, and imaging informatics). Each paper was assessed by two of the section co-editors. In doing so, we identified 22 candidate papers with more than 2 Likert points on average. These papers were read entirely and rescored by all co-editors. After intensive discussion, we nominated 14 candidate best papers. At least five external reviewers then rated the papers. The four finalist papers were identified based on the composite rating of all external reviewers. These best papers were approved by consensus of the IMIA Yearbook editorial board. 
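The two filtering steps described above (a minimum number of papers per journal, followed by an average Likert score above 2) can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the function names `filter_by_journal_count` and `shortlist`, and the toy data are hypothetical, and the strict "more than 2" cut-off follows the wording of this section.

```python
from collections import Counter
from statistics import mean


def filter_by_journal_count(papers, min_papers=3):
    """Keep only papers from journals with at least `min_papers` hits."""
    counts = Counter(paper["journal"] for paper in papers)
    return [p for p in papers if counts[p["journal"]] >= min_papers]


def shortlist(ratings, cutoff=2.0):
    """Return ids of papers whose mean Likert score (1 to 3) exceeds `cutoff`."""
    return sorted(pid for pid, scores in ratings.items() if mean(scores) > cutoff)


# Toy query result: journal A contributes three papers, B one, C two.
papers = [
    {"id": "p1", "journal": "A"},
    {"id": "p2", "journal": "A"},
    {"id": "p3", "journal": "A"},
    {"id": "p4", "journal": "B"},
    {"id": "p5", "journal": "C"},
    {"id": "p6", "journal": "C"},
]
remaining = filter_by_journal_count(papers, min_papers=3)

# Two independent ratings per remaining paper (made-up values).
ratings = {"p1": [3, 3], "p2": [2, 2], "p3": [3, 2]}
candidates = shortlist(ratings)

print([p["id"] for p in remaining])
print(candidates)
```

Normalizing the threshold by a journal's annual output, as planned for next year, would replace the raw count in `filter_by_journal_count` with a ratio.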
This year's literature yielded a number of interesting developments in the field of SSII. We highlight three emerging trends, particularly in light of the COVID-19 pandemic. One trend has been the use of machine learning to extract additional physiological information from existing measurements. For example, measurements from a photoplethysmogram (PPG), which measures blood volume changes using a pulse oximeter, can potentially provide systolic and diastolic blood pressure measurements. Hsu et al. [14] showed that a fully connected deep neural network trained on the MIMIC (Multi-parameter Intelligent Monitoring for Intensive Care) II cohort achieves low mean absolute error and root mean squared error when estimating blood pressure from PPG compared to reference measurements using a cuff. Miao et al. [15] proposed continuous blood pressure measurement from electrocardiogram (ECG) signals using a fusion of a residual network and a long short-term memory model. Their methods were trained and evaluated using data from the MIMIC III cohort as well as patients from their institution. Another interesting application is the interpretation of ECG signals to predict hypoglycemic events. Using a commercially available wearable ECG monitor, Porumb et al. [16] demonstrated one approach using a combined convolutional and recurrent neural network.
The outbreak of COVID-19 reinforced the utility and importance of using data from wireless sensors and consumer wearables to track infections. Wearable health monitors such as PPG sensors have played an important role in providing markers of respiratory health (cough frequency/intensity, respiratory rate/effort). Novel sensing materials and fabrication techniques are yielding ways to unobtrusively monitor symptoms, record lung sounds, and measure the respiratory rate. Ding et al. [17] provide a review of these developments in greater depth, while Wang et al. analyze unobtrusive health monitoring in vehicles [2] and homes [3]. 
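For the cuff-less blood pressure studies discussed above, performance is summarized as mean absolute error (MAE) and root mean squared error (RMSE) against a reference measurement. The sketch below only illustrates how the two metrics are computed; the function names and the systolic values (in mmHg) are made up for illustration and are not taken from any of the cited papers.

```python
import math


def mae(predicted, reference):
    """Mean absolute error between paired estimates and reference values."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def rmse(predicted, reference):
    """Root mean squared error; penalizes large deviations more than MAE."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


# Hypothetical systolic pressures in mmHg: model estimates vs. reference.
estimated = [123, 76, 130, 90]
reference = [120, 80, 130, 85]

print(mae(estimated, reference))
print(rmse(estimated, reference))
```

Because RMSE squares each deviation, it is never smaller than MAE on the same data, and a large gap between the two hints at a few large outlier errors.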
Advances in SSII have also been driven by the availability of large, diverse, publicly available patient data. MIMIC [18] remains an important resource for developing and testing algorithms on signals and sensors data collected in an intensive care environment. The UK Biobank is another rich open-access resource that collected multi-modal imaging data on a subset of 100,000 participants [19]. The pandemic has also spurred the establishment of new resources focusing on COVID-related diagnosis and treatment. In the United States, the national COVID Cohort Collaborative is building a national data resource to accelerate the development of therapeutics [20]. Informatics challenges related to indexing, security, standardization, data provenance, and linked clinical outcomes will need to be addressed. The RSNA International COVID-19 Open Radiology Database is another example, providing annotated and de-identified chest computed tomography (CT) and radiography data on COVID-19-positive patients [21].
Complementing efforts to collect larger real-world patient datasets, techniques for introducing realistic simulated data to improve model training are also being examined. Augmentation is being pursued to generate additional synthetic examples. Shi et al. [22] proposed a knowledge-guided adversarial augmentation approach to generate additional exams for training a model that classifies thyroid ultrasound as benign or malignant. Leveraging standardized terms describing nodules extracted from radiology reports, the authors showed that their knowledge-guided auxiliary classifier generative adversarial network outperforms other deep learning-based comparison methods.
Despite the large number of Artificial Intelligence/Machine Learning (AI/ML) models being published, the translation of these works to clinical practice is slow. One hurdle is the difference between the characteristics of the target population and those of the population that the models were trained on. Fu et al. 
investigated the impact of differences in dose and reconstruction kernel on AI-based detection of simulated pulmonary nodules in chest phantoms scanned using CT [23]. The authors found that while differences in dose did not affect the volume measurements of the four evaluated AI algorithms, the kernel had a significant effect. Moreover, some models were more adversely impacted than others, but transparency regarding model training (e.g., population characteristics used for training) is generally lacking. With the rapid pace of research into COVID-19, a well-documented concern has been that most published models are poorly reported and at high risk of bias [24]. Similarly, in signals and sensors research, most studies are based on data from a single institution, which raises concerns about bias, and many studies do not report performance on an external test cohort [25]. We anticipate that as the field moves towards clinical translation, the requirements for standardized reporting of AI/ML studies and characterization of model generalization in different target populations will become increasingly important.
In summary, SSII remains a rapidly growing field that increasingly combines a broad range of available imaging and sensor technologies with a significantly rising number of innovative machine learning and AI-based approaches. The main emerging trends can be traced back to the extraction of new or additional information from existing SSII measurements and to the provision of access to larger data sources and repositories, which are urgently needed for the development and validation of new image and signal processing tools. Translating methods into clinical application remains a challenge, as new algorithms and tools need to demonstrate their clinical validity by proving scientific validity, analytical validity, and clinical performance, in particular when approaching regulatory approval as software as a medical device. 
This year, we also optimized the process of searching the literature for candidate best papers, which, in our opinion, led to a further increase in the quality of the selected papers. This is, of course, an ongoing process, and we strive to improve it year on year. Last but not least, 2020 was shaped mainly by the COVID-19 pandemic. Numerous innovations in SSII in the fight against the pandemic have already contributed to improved patient management in this globally unique and challenging situation.
Label noise is unavoidable in many medical image datasets. It can be caused by limited attention or expertise of the human annotator, the subjective nature of labeling, or errors in computerized labeling systems. This is especially concerning for medical applications, where datasets are typically small, labeling requires domain expertise and suffers from high inter- and intra-observer variability, and erroneous predictions may influence decisions directly impacting human health. The authors reviewed the state of the art in label noise handling in deep learning and investigated how these methods were applied to medical image analysis. Their key recommendations to account for label noise are: label cleaning and pre-processing, adaptations of network architectures, the use of label-noise-robust loss functions, re-weighting of data, label consistency checks, and the choice of training procedures. They underpin their findings with experiments on three medical datasets where label noise was introduced by the systematic error of a human annotator, by inter-observer variability, or by noise generated from an algorithm.
Wireless capsule endoscopy (WCE) is an established examination method for the diagnosis of small-bowel diseases. Automated detection and classification of protruding lesions of various types from WCE images remains an open challenge, while manual reading takes a physician 1 to 2 hours on average for a correct diagnosis. 
In this work, a deep neural network architecture, termed single shot multibox detector (SSD) and based on a deep convolutional neural network (CNN) structure with 16 or more layers, was trained on 30,584 WCE images from 292 patients collected from multiple centers and tested on an independent set of 17,507 images from 93 patients, including 7,507 images of protruding lesions from 73 patients. All regions showing protruding lesions were manually annotated by six independent expert endoscopists, representing the ground truth for training the network. The CNN performance was evaluated by ROC analysis, revealing an AUC of 0.911, a sensitivity of 90.7%, and a specificity of 79.8% at the optimal cut-off value of 0.317 for the probability score. In a subanalysis of the categories of protruding lesions, the sensitivities ranged between 77.0% and 95.8% for the detection of polyps, nodules, epithelial tumors, submucosal tumors, and venous structures. In individual patient analyses, the detection rate of protruding lesions was 98.6%. The rates of concordance between the labeling by the CNN and that of three expert endoscopists were between 42% and 83% for the different morphological structures. A false positive/negative error analysis was reported, indicating some limitations of the current approach in terms of an imbalanced number of cases, color diversity, and variation of structures in the images. The work is notable for its excellent clinical applicability, using a new computer-aided system with good diagnostic performance to detect protruding lesions in small-bowel capsule endoscopy. 
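The sensitivity and specificity reported above depend on the cut-off applied to the CNN's probability score. The following sketch shows how both are computed at a given cut-off and how a cut-off can be chosen from the data. The summary above does not state how the optimal cut-off of 0.317 was determined; maximizing Youden's J (sensitivity + specificity - 1) is used here as an assumption, and the function names, scores, and labels are made up for illustration.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff count as positive."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Return the observed score that maximizes Youden's J (an assumption)."""
    def j_statistic(cutoff):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        return sens + spec - 1
    return max(sorted(set(scores)), key=j_statistic)


# Toy probability scores with ground truth (1 = lesion present).
scores = [0.10, 0.20, 0.30, 0.35, 0.50, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]

best = youden_cutoff(scores, labels)
print(best, sensitivity_specificity(scores, labels, best))
```

Lowering the cut-off trades specificity for sensitivity, which is why a single operating point should be read together with the full ROC curve and its AUC.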
The section editors would like to thank Adrien Ugon for supporting the external review process and the external reviewers for their input on the candidate best papers.
The queries used to retrieve literature from PubMed and Scopus differ slightly, as the databases do not use the same data fields and query syntax. A separate query was performed for sensors, signals, and imaging informatics.
OR "physiological signal" OR "physiological signals" OR "blood pressure" OR "temperature" OR "heart rate" OR "heartbeat" OR "heartbeats" OR "pulse rate" OR "respiration rate" OR "respiratory rate" OR "breathing rate" OR "ECG" OR "electrocardiography" OR "electrocardiogram" OR "menstrual cycle" OR "oxygen" OR "oximetry" OR "glucose" OR "end-tidal" OR "emg" OR "electromyography" OR "electromyogram" OR "ppg" OR "photoplethysmography" OR "photoplethysmogram" OR "pcg" OR "phonocardiography" OR "phonocardiogram" OR "bcg" OR "ballistocardiography" OR "ballistocardiogram" OR "scg" OR "seismocardiography" OR "seismocardiogram
OR "phonocardiogram" OR "bcg" OR "ballistocardiography" OR "ballistocardiogram" OR "scg" OR "seismocardiography" OR "seismocardiogram" OR "eog" OR "electrooculography" OR "electrooculogram" OR "eda" OR "electrodermal activity" OR "Respiration" OR "Blood Pressure" OR "eeg" OR "electroencephalogram" OR "bci" OR "brain computer interface" ) AND ( "processing" OR "analytics" OR "analysis" OR "analyse" OR "analyze" OR "analysing" OR "analyzing" OR "enhancement" OR "enhancements" OR "segmentation" OR "feature extraction" OR "feature selection" OR "classification" OR "clustering" OR "measurement" OR "quantification" OR "registration" OR "recognition" OR "reconstruction" OR "interpretation" OR "retrieval" "augmentation" OR "data mining" OR "computer-assisted" OR "computer-aided" OR "artificial intelligence" OR "machine learning" OR "deep learning" OR "neural network" OR "computer vision" OR "autoencoder" OR "auto-encoder" OR "Botzmann" OR "U-net" OR "support vector machine" OR "SVM" OR "random forest" ) ) AND PUBDATETXT
Scientific Integrity Review"[pt] NOT "Systematic Review
OR "MRI" OR "echocardiography" OR "sonography" OR "ultrasound" OR "endoscopy" OR "arthroscopy" OR "bronchoscopy" OR "colonoscopy" OR "cystoscopy" OR "laparoscopy" OR "nephroscopy" OR "laryngoscopy" OR "funduscopy" OR "thermography" OR "photography" OR "arthroscopy" OR "microscopy" ) AND ( "processing" OR "analytics" OR "analysis" OR "analyse" OR "analyze" OR "analysing" OR "analyzing" OR "enhancement" OR "enhancements" OR "segmentation" OR "feature extraction" OR "feature selection" OR "classification" OR "clustering" OR "measurement" OR "quantification" OR "registration" OR "recognition" OR "reconstruction" OR "interpretation" OR "retrieval" "augmentation" OR "data mining" OR "computer-assisted" OR "computer-aided" OR "artificial intelligence" OR "machine learning" OR "deep learning" OR "neural network" OR "computer vision" OR "autoencoder" OR "auto-encoder" OR "Botzmann" OR "U-net" OR "support vector machine" OR "SVM" OR "random forest" ) ) AND PUBDATETXT