key: cord-0759683-03m9bgc6 authors: Rodrigues, Valquiria C.; Soares, Juliana C.; Soares, Andrey C.; Braz, Daniel C.; Melendez, Matias Eliseo; Ribas, Lucas C.; Scabini, Leonardo F.S.; Bruno, Odemir M.; Carvalho, Andre Lopes; Vieira Reis, Rui Manuel; Sanfelice, Rafaela C.; Oliveira, Osvaldo N. title: Electrochemical and optical detection and machine learning applied to images of genosensors for diagnosis of prostate cancer with the biomarker PCA3 date: 2020-08-07 journal: Talanta DOI: 10.1016/j.talanta.2020.121444 sha: bd7b2125255327f392ac8cdb67e20a196e21fe1d doc_id: 759683 cord_uid: 03m9bgc6 The development of simple detection methods aimed at widespread screening and testing is crucial for many infections and diseases, including prostate cancer where early diagnosis increases the chances of cure considerably. In this paper, we report on genosensors with different detection principles for a prostate cancer specific DNA sequence (PCA3). The genosensors were made with carbon printed electrodes or quartz coated with layer-by-layer (LbL) films containing gold nanoparticles and chondroitin sulfate and a layer of a complementary DNA sequence (PCA3 probe). The highest sensitivity was reached with electrochemical impedance spectroscopy with the detection limit of 83 pM in solutions of PCA3, while the limits of detection were 2000 pM and 900 pM for cyclic voltammetry and UV-vis spectroscopy, respectively. That detection could be performed with an optical method is encouraging, as one may envisage extending it to colorimetric tests. Since the morphology of sensing units is known to be affected in detection experiments, we applied machine learning algorithms to classify scanning electron microscopy images of the genosensors and managed to distinguish those exposed to PCA3-containing solutions from control measurements with an accuracy of 99.9%. 
The performance in distinguishing each individual PCA3 concentration in a multiclass task was lower, with an accuracy of 88.3%, which means that further developments in image analysis are required for this innovative approach.
The search for new diagnostic methodologies has gained tremendous impetus with the Covid-19 pandemic outbreak in 2020, for it has become clear that low-cost, easily deployable tests are crucial for humanity. Three main challenges have to be faced to fulfill such stringent requirements: the sensing units must be cheap and easy to manufacture even in developing countries; the principle of detection should be simple without requiring highly trained personnel to operate the measuring equipment; data analysis should be robust and fast. Much has been done in all of these challenges, as can be easily confirmed in the recent literature for various types of biosensors (see e.g. some review papers) [1] [2] [3] [4] [5].
However, this considerable body of knowledge has not been transformed into products for various reasons, the most important of which is perhaps the high cost of device engineering to develop tests and certify them through government agencies. This is particularly the case for neglected diseases or for diseases in which the number of tests to be commercialized would not justify the large investments. We advocate, nevertheless, that efforts should be made to develop sensing technology that is sufficiently generic to leverage progress in different areas, across different diseases and for monitoring health conditions. Moreover, there are diseases for which such new methodologies are urgent. This applies to prostate cancer [6], which is rarely symptomatic as the tumor grows quietly, and the failure to detect it early makes this disease the second cause of death in men in industrialized countries [7]. Today, prostate cancer is diagnosed with a combination of a blood test to detect the prostate specific antigen (PSA) and rectal examination [8]. Unfortunately, in spite of its high sensitivity, the PSA test has low specificity, thus resulting in many negative biopsies, i.e. unnecessary and invasive procedures [9] [10]. An increased PSA concentration may arise from non-cancerous conditions such as prostate infections, prostate enlargement and even recent sexual activity [11]. This state of affairs may change significantly if more specific biomarkers are found. An important candidate is the prostate cancer gene 3 (PCA3) located on chromosome 9q21-22 [12] [13], which is prostate-specific and associated with prostate cancer [14] [15] [16] [17]. PCA3 was identified in 1995 [18], being initially called DD3 (Differential Display Clone 3) since differential display analysis was used to compare mRNA expression patterns of normal versus prostate cancer tissues [19].
Overexpression of the PCA3 gene was observed in 95% of prostate cancer samples, while gene 3 expression was not detected in any other normal or multi-organ tumor tissues. In benign altered prostate cells, very low levels of gene expression were detected [20] [21]. The PCA3 biomarker is specific for prostate cancer and there is no cut-off concentration. Even low concentrations of PCA3 indicate a high probability that a patient has (or will develop) prostate cancer. There are a few reports of sensors to detect PCA3 [15, 20, 22, 23], including our own recent work [22], which is the only one using a simple principle of detection to the best of our knowledge. In this study, we build upon this previous work to address the last two of the three challenges mentioned at the beginning of this introduction. More specifically, we show that genosensors can be built with simple manufacturing processes and applied with varied principles of detection. We show that PCA3 can be detected using electrochemical methods, optical absorption spectroscopy and image analysis of the sensing devices. Furthermore, the matrix onto which the DNA sequence (PCA3 probe) is immobilized differs from our previous work, as we incorporated gold nanoparticles along with chitosan and chondroitin sulfate to enhance the electrical signal. As for data analysis, we employ information visualization techniques for the electrochemical impedance data, with which the highest sensitivity was obtained, and machine learning methods to classify the images taken from the genosensing units. The overall aim was to obtain a generic platform in terms of materials, devices and data analysis, which can be replicated for other biomarkers and other diseases.
The genosensors were made with layer-by-layer (LbL) films [24] wafer. This nanoparticle size is compatible with data in the literature where the reducing agent was also borohydride [26].
Nanoparticle composition was confirmed with elemental analysis in Fig. S2b, whose elements are listed in Table S1. The LbL assembly on the carbon electrode was carried out as follows.
Polarization-modulated infrared reflection absorption spectroscopy (PM-IRRAS) was used to determine the chemical groups involved in LbL film formation, using a KSV PMI 550 instrument with 8 cm⁻¹ spectral resolution and 80° incidence angle. Cyclic voltammetry (CV) and electrochemical impedance spectroscopy were employed to characterize the LbL films and to detect the biomarker with a PGSTAT 204 Autolab electrochemical system (Eco Chemie, Netherlands), controlled by NOVA software. For CV the potential range was from -0.6 to 0.6 V. Detection was also performed with UV-vis spectroscopy using a Hitachi U-2001 spectrophotometer, where the band at 260 nm associated with hybridization was analyzed. The reproducibility of the genosensor was tested with triplicate measurements.
The data acquired in impedance spectroscopy measurements were visualized with the multidimensional projection technique referred to as interactive document mapping (IDMAP) [27], in which each spectrum is plotted as a data instance on a 2D plot. This dimensionality reduction technique has proven excellent for biosensors [28] and is based on preserving, in the projected space, the similarity among the objects (spectra in this case) in the high-dimensional space. It employs the optimization function in Equation (1
layer for electrical contact and image generation. The Pt layer was thin enough so that Pt was not incorporated into the sample, but sufficient to maintain electrical contact [29]. After coating, the samples were placed in a vacuum chamber to eliminate moisture.
In the analysis with machine learning a set of 32 SEM images was employed, corresponding to sensing units that were subjected to distinct concentrations of PCA3 in addition to the negative sequence (non-complementary) and a blank measurement for control as indicated in Table 1, thus leading to an 8-class problem. All images have 8-bit resolution (gray scale) and they were taken with different sizes in terms of number of pixels and scales (200 nm and 300 nm), which allowed a semi-systematic study of the effects of image size. The entire data for this set of images are available in the Supporting Information. Image features were extracted in step 1 using texture analysis techniques. Texture is a key element of human visual perception, used in many computer vision systems and for a variety of applications [30]. In this step, we employed the following texture analysis techniques for feature extraction: Gray Level Difference Matrix (GLDM) [31], Fourier descriptors [32], Complex Network Texture Descriptor (CNTD) [33], Fractal descriptors [34], Adaptive Hybrid Pattern (AHP) [35], Local Binary Patterns (LBP) [36], Complex Network and Randomized Neural Network (CNRNN) [37] and Local Complex Features and Neural Network (LCFNN) [37]. These techniques analyze texture information in different ways (using models, statistics, spectra, and learning) and are suitable for a small number of samples in the dataset, also providing fast results. The image features (feature vectors) obtained from the images using texture analysis were classified (step 2) using the unsupervised machine learning technique referred to as t-Distributed Stochastic Neighbor Embedding (t-SNE) and 3 supervised machine learning algorithms. Two types of classification were executed, viz.
a binary classification between the samples exposed to PCA3 (positive) and those that were not (negative and blank), and a multiclass classification where the distinct PCA3 concentrations were considered. For supervised machine learning, the texture analysis techniques and classifiers were evaluated in performance and generalization capacity using the average accuracy and standard deviation over 100 random trials. In each trial we adopted a 10-fold cross-validation scheme to separate the test and training sets. In this scheme, one fold is used for testing and the remaining nine folds are employed to train the classifier; this procedure is repeated until every fold has been used for testing. The parameters of the texture analysis techniques and classifiers were kept at the standard values according to the original paper for each technique.
The immobilization of the PCA3 probe on carbon electrodes led to a slight shift in the oxidation peak to more positive potentials in Fig. S4 (a), though the area within the voltammograms did not change. A small increase in resistance from 3.4 kΩ to 3.6 kΩ was inferred from the Nyquist plot in Fig. S4 (b), consistent with the cyclic voltammetry measurements. This adsorption could be visualized in the SEM images of
Hybridization between the PCA3 probe and PCA3 also affects the electrochemical impedance spectroscopy data, as seen in Fig. 4, especially at low frequencies where the electrical response is dominated by the electrical double layer [22]. A calibration curve was built from the data in Fig. 4 using the impedance at 30 Hz. Fig. 5 shows a sharp increase at low PCA3 concentrations, followed by stabilization as the number of active sites available for hybridization tends to zero. This curve can be taken as an adsorption isotherm, and was fitted with a combination of two Freundlich isotherms.
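The two-isotherm description of the calibration curve can be illustrated with a short Python sketch. It assumes that the two Freundlich isotherms are combined as a sum, Z(c) = K1·c^(1/n1) + K2·c^(1/n2); this functional form is an assumption, since the exact expression is not spelled out here. All function names, parameter values and concentrations are hypothetical and are not the fitted values behind Fig. 5. The helper `fit_freundlich_loglog` estimates the parameters of a single Freundlich term by linear regression on log-log axes, which is one common way to obtain starting values before a two-term fit.

```python
import math


def freundlich(c, k, n):
    """Single Freundlich term: response = k * c**(1/n)."""
    return k * c ** (1.0 / n)


def double_freundlich(c, k1, n1, k2, n2):
    """Assumed combination: sum of two Freundlich terms."""
    return freundlich(c, k1, n1) + freundlich(c, k2, n2)


def fit_freundlich_loglog(conc, signal):
    """Fit one Freundlich term by least squares on log-log axes.

    log(signal) = log(k) + (1/n) * log(conc), so the slope gives 1/n
    and the intercept gives log(k). Returns (k, n).
    """
    xs = [math.log(c) for c in conc]
    ys = [math.log(s) for s in signal]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), 1.0 / slope


if __name__ == "__main__":
    # Hypothetical concentrations (pM) and noise-free synthetic responses.
    conc = [100.0, 500.0, 1000.0, 5000.0]
    signal = [freundlich(c, 2.0, 3.0) for c in conc]
    k, n = fit_freundlich_loglog(conc, signal)
    print(f"k = {k:.3f}, n = {n:.3f}")
```

With real impedance data the two regimes would typically be fitted by nonlinear least squares on the full two-term expression; the log-log step only applies to concentration ranges where a single term dominates.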
The limit of detection was 83 pM. Impedance spectroscopy was therefore more sensitive than cyclic voltammetry (above) and than the genosensor in a recent work where chitosan and carbon nanotubes were used as the immobilization matrix, for which the detection limit was 128 pM [22].
The control experiments with electrochemical impedance measurements in quartz.
Fig. 7 shows the absorption spectra for the genosensors exposed to different PCA3 concentrations. The band at 260 nm is assigned to absorption of DNA bases of the PCA3 probe and its intensity decreases with increasing PCA3 concentration [41]. This decrease is due to the so-called hypochromic effect [42], explained as follows. The close interaction between stacked bases in nucleic acids causes a decrease in UV light absorption compared with the absorption of a solution with the same concentration of free nucleotides. It is worth mentioning that this hypochromic effect does not occur when the genosensor is exposed to the non-complementary sequence, as seen in Fig. S7 in the Supporting Information.
All the techniques employed here for PCA3 gene detection were successful, with electrochemical impedance being the most sensitive with the lowest detection limit. The sensitivity even with the most efficient method is not as high as the one obtained by Fu and colleagues [20], who employed surface-enhanced Raman scattering (SERS) with a device using a PCA3 mimic. The genosensor developed here is nevertheless attractive because it has the advantage of ease of production with low cost; in addition, no sophisticated equipment is required, unlike the more sensitive SERS sensor [20]. In future studies the genosensor will be applied to urine samples from patients.
One of the easiest ways to obtain a fast diagnosis with present technology is to take a picture of a sensing unit exposed to the sample and process the image.
This is different from the standard approaches involving image analysis for diagnostics because the image is taken not of the biological sample itself but of the sensing unit. Hence, this strategy will only work if the detection procedure leads to a change in texture or morphology or any other image feature of the sensing units. Since it is well established that the surfaces of biosensors are altered during the measurements, it seems natural to assume that such changes could be utilized for diagnosis. Yet, this strategy has not been explored in the literature. To the best of our knowledge the only work based on image analysis of sensing units is our own [29], in which we showed that the standard deviation of the circularity of objects on SEM images correlated with the concentration of a cancer biomarker. We have therefore decided to extend this research and apply machine learning to the images of genosensors subjected to the same procedures as in the electrochemical and optical detection in the previous subsection. A typical set of representative images is shown in Fig. 9. We used SEM images for proof-of-concept experiments, though ideally one should use optical images: if the strategy does not work with SEM images, which are expected to capture the nanoscopic changes owing to hybridization in the genosensors, it is unlikely to work with optical images. The classification was performed in 3 datasets, which differ in number of examples (Table S2)
Table 2 were obtained using the dataset with the window size of 300×300 pixels (the results for the two other datasets are given in Tables S4 and S5). In the table,
(1-NN). The maximum accuracy was 99.9% (standard deviation 0.3) using the LCFNN descriptor with SVM and LDA classifiers in the binary classification.
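Accuracies quoted as a mean with a standard deviation follow from the evaluation protocol described in the methods: 100 random trials, each a 10-fold cross-validation. The Python sketch below illustrates that protocol under stated assumptions. It uses a 1-nearest-neighbor classifier only because it needs no external library; the function names, the toy two-dimensional feature vectors and the use of the sample standard deviation are hypothetical choices and do not reproduce the LCFNN features or the SVM and LDA classifiers.

```python
import random
import statistics


def one_nn_predict(train_x, train_y, x):
    """Return the label of the training vector closest to x (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def kfold_accuracy(features, labels, k, rng):
    """One k-fold cross-validation run: every sample is tested exactly once."""
    idx = list(range(len(features)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in fold:
            if one_nn_predict(train_x, train_y, features[i]) == labels[i]:
                correct += 1
    return correct / len(features)


def repeated_cv(features, labels, k=10, trials=100, seed=0):
    """Mean and sample standard deviation of accuracy over random trials."""
    rng = random.Random(seed)
    accs = [kfold_accuracy(features, labels, k, rng) for _ in range(trials)]
    return statistics.mean(accs), statistics.stdev(accs)


if __name__ == "__main__":
    # Hypothetical, well-separated toy features for two classes.
    features = [(0.01 * i, 0.0) for i in range(10)] + [(5.0 + 0.01 * i, 0.0) for i in range(10)]
    labels = [0] * 10 + [1] * 10
    mean_acc, std_acc = repeated_cv(features, labels)
    print(f"accuracy: {100 * mean_acc:.1f} ({100 * std_acc:.1f})")
```

Reshuffling the samples before each trial is what produces a spread of accuracies; with a small image set, that spread gives a rough indication of how much a result depends on the particular split.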
In the binary task, the classification system had to distinguish the images of sensing units exposed to PCA3 (with all concentrations put together) from those which were not (negative and blank). Thus, the accuracy indicates a strong ability to separate the two classes.
[44], which should be entirely free from overfitting artifacts. This technique is used for high-dimensional data visualization because of its ability to reveal the data structure, such as clusters of similar samples. The same feature vectors employed above were now embedded in a two-dimensional space with t-SNE, which does not require class information. Fig. 10 shows
Table 3 are smaller than those in Table 2, with a maximum of 70.83% for the multiclass scenario. This is indeed a more challenging task, which requires more sophisticated and complex image analysis techniques.
We designed a genosensor made with LbL films coated with a layer of a PCA3 probe, which proved effective in detecting PCA3 with different principles of detection. The most effective was electrochemical impedance spectroscopy, with which a limit of detection of 83 pM was reached, being more sensitive than the genosensor from our previous work [22]
Taken together, we believe that the results and concepts reported here may pave the way for a new era of diagnostics (not only for prostate cancer) in which simple detection methods may be employed and leveraged with machine learning applied to images of the sensing units themselves.
Biosensors on chip: A critical review from an aspect of micro/nanoscales
Biosensors for early diagnosis of pancreatic cancer: a review
A microfluidic based biosensor for rapid detection of Salmonella in food products
Wireless chemical sensors and biosensors: A review
Current advances and future visions on bioelectronic immunosensing for prostate-specific antigen
New Strategies in Prostate Cancer: Targeting Lipogenic Pathways and the Energy Sensor AMPK
Prostatic specific antigen for prostate cancer detection
PCA3: From basic molecular science to the clinical lab
PCA3 score and prostate cancer diagnosis at repeated saturation biopsy. Which cut-off: 20 or 35?
PCA3 Molecular Urine Assay for Prostate Cancer in Men Undergoing Repeat Biopsy
Carbon Nanotubes: An Emerging Drug Carrier for Targeting Cancer Cells
Prostate cancer antigen 3 (PCA3) RNA detection in blood and tissue samples for prostate cancer diagnosis
Carbon Nanomaterial Based Biosensors for Non-Invasive Detection of Cancer and Disease Biomarkers for Clinical Diagnosis
Graphene Oxide-Upconversion Nanoparticle Based Optical Sensors for Targeted Detection of mRNA Biomarkers Present in Alzheimer's Disease and Prostate Cancer
The novel prostate cancer antigen 3 (PCA3) biomarker
The diagnostic value of PCA3 gene-based analysis of urine sediments after digital rectal examination for prostate cancer in a Chinese population
Interview with Jack Schalken
The use of PCA3 in the diagnosis of prostate cancer
Highly sensitive detection of prostate cancer specific PCA3 mimic DNA using SERS-based competitive lateral flow assay
Approaches to urinary detection of prostate cancer
Detection of the Prostate Cancer Biomarker PCA3 with Electrochemical and Impedance-Based Biosensors
New biomarkers for diagnosis and prognosis of localized prostate cancer
Fuzzy Nanoassemblies: Toward Layered Polymeric Multicomposites
Electrostatic and electrosteric stabilization of aqueous suspensions of barite nanoparticles
Preparação de nanopartículas de prata e ouro: um método simples para a introdução da nanociência em laboratório de ensino
Information visualization techniques for sensing and biosensing
Immunosensors made with layer-by-layer films on chitosan/gold nanoparticle matrices to detect D-dimer as biomarker for venous thromboembolism
Analysis of Scanning Electron Microscopy Images To Investigate Adsorption Processes Responsible for Detection of Cancer Biomarkers
Texture Feature Extraction Methods: A Survey
A Comparative Study of Texture Measures for Terrain Classification
Texture Classification with Generalized Fourier Descriptors in Dimensionality Reduction Context: An Overview Exploration
Texture analysis and classification: A complex network-based approach
Plant Leaf Identification Based on Volumetric Fractal Dimension
An adaptive hybrid pattern for noise-robust texture analysis
Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
Fusion of complex networks and randomized neural networks for texture analysis
Vibrational spectroscopy at electrified interfaces
Introduction to infrared and Raman spectroscopy
Microfluidic-Based Genosensor To Detect Human Papillomavirus (HPV16) for Head and Neck Cancer
Lehninger principles of biochemistry
Statistical pattern recognition
Visualizing data using t-SNE
A survey of deep neural network architectures and their applications