key: cord-0030914-crd1cx93 authors: Fidler, Gábor; Szilágyi-Rácz, Anna Anita; Dávid, Péter; Tolnai, Emese; Rejtő, László; Szász, Róbert; Póliska, Szilárd; Biró, Sándor; Paholcsek, Melinda title: Circulating microRNA sequencing revealed miRNome patterns in hematology and oncology patients aiding the prognosis of invasive aspergillosis date: 2022-05-03 journal: Sci Rep DOI: 10.1038/s41598-022-11239-z sha: b4976a7a707f04e93468ffb1dfe73df180004554 doc_id: 30914 cord_uid: crd1cx93 Invasive aspergillosis (IA) may occur as a serious complication of hematological malignancy. Delays in antifungal therapy can lead to an invasive disease resulting in high mortality. Currently, there are no well-established blood circulating microRNA biomarkers or laboratory tests which can be used to diagnose IA. Therefore, we aimed to define dysregulated miRNAs in hematology and oncology (HO) patients to identify biomarkers predisposing disease. We performed an in-depth analysis of high-throughput small transcriptome sequencing data obtained from the whole blood samples of our study cohort of 50 participants including 26 high-risk HO patients and 24 controls. By integrating in silico bioinformatic analyses of small noncoding RNA data, 57 miRNAs exhibiting significant expression differences (P < 0.05) were identified between IA-infected patients and non-IA HO patients. Among these, we found 36 differentially expressed miRNAs (DEMs) irrespective of HO malignancy. Of the top ranked DEMs, we found 14 significantly deregulated miRNAs, whose expression levels were successfully quantified by qRT-PCR. MiRNA target prediction revealed the involvement of IA related miRNAs in the biological pathways of tumorigenesis, the cell cycle, the immune response, cell differentiation and apoptosis. Globally, the incidence of fungal infections is evidenced by the worrisome prevalence values of approximately 20 million cases of allergic fungal diseases and more than 1 million cases of invasive fungal infections (IFIs) 1, 2 . IFIs are associated with dramatic mortality rates, ranging from 20 to 50% despite currently available powerful antifungal agents 3, 4 . Underscoring the burden of invasive aspergillosis (IA), a marked increase in disease prevalence was observed due to improved diagnostics, an overall escalation in the use of immunosuppressive therapies, and an increased number of organ transplantations performed in recent decades 5, 6 . IA remains a major issue among patients who have undergone either stem cell or solid organ transplantation, with a prevalence of over 10% [7] [8] [9] [10] . Considering the impact of the severity of infection, mold specific nucleic acid biomarkers and galactomannan antigen (GM) may prove to be valuable for a timely disease diagnosis. Because of devastating statistics and high mortality rates, new and alternative diagnostic strategies are needed. To diagnose patients with IA in a timely manner, there is a comprehensive need to identify biomarkers with high specificity and sensitivity. Moreover, the application of minimally invasive procedures to obtain nucleic acid targets has become a research trend. Ultimately, biomarkers must be easily detectable with satisfactory positive and negative predictive values and must also discriminate hematology and oncology (HO) patients with or without IA. MicroRNAs (miRNAs) are a class of typically small noncoding RNAs that can regulate gene expression posttranscriptionally through miRNA::mRNA interactions. By mediating the degradation of specific mRNAs, Quantitative analysis of the small noncoding RNA transcriptome revealed shared and unique miRNAs. In this study, high-throughput small RNA sequencing followed by in silico data analysis was used to detect unique and conserved circulating miRNAs in the study cohort, including healthy controls (n = 24) and HO patients with (HO-proven IA; n = 4, HO-probable IA; n = 3) or without (HO-possible IA; n = 19) IA. In total, 735 miRNAs were omitted from the analysis due to a very low read number (read per million [RPM < 10]) across all samples. We identified 364 miRNAs, with a read number above 10 (RPM > 10). We focused on these in our following analyses. Venn diagram was created to represent the number of miRNAs that were shared ("intersec- www.nature.com/scientificreports/ tions") and unique) between different datasets is ( Fig. 1) . Small RNA transcriptome compositions exhibited remarkable differences between our experimental groups (Fig. 1a) . Overall, 190 miRNAs were uniformly present in all experimental groups, representing 19 .02% of all identified miRNAs. By considering the global expression level distribution profiles of the common miRNAs, considerable differences were detected when comparing healthy controls to HO patients with or without IA (Fig. 1b) . As shown, IA patient group exhibits remarkable expression changes in several miRNA read numbers. Analyses of the expressed conserved miRNAs revealed that most genes were uniformly up-or downregulated in the non-IA patient group. We also identified unique miRNAs in different experimental groups ( Supplementary Fig. 1 ). In total, 21 and 20 miRNAs were present exclusively in healthy and non-aspergillosis HO controls. Based on our data we found 41 miRNAs that were presented in hemato-oncology patients with proven/probable IA. Of these, 21 were present in patients with proven IA (HO-proven), whereas 17 were present in patients with possible IA (HO-probable). DEMs in HO patients with IA. Differential expression analysis was performed by retrieving the expressed reads of the 190 conserved miRNAs. Multiple miRNAs showed remarkable differences in expression when comparing HO patients with (HO-proven, HO-probable) or without (HO-possible) IA. Volcano plots were generated to identify the miRNAs showing fold differences with high statistical significance (P values ≤ 0.05) and expressing log 2 -fold changes greater than 1 and lower than − 1 (− 1 > fold change < 1) using the LIMMA statistical model (Fig. 2 ). Based on these criteria, which were considered stringent, we were able to reduce the number of conserved miRNAs to 57. Thereafter, we further identified 21 miRNAs in the IA group, whose miRNA expression profile was significantly different (twofold change with P < 0.05) in comparison to non-IA patients. Hereaf- red, orange and blue correspond to miRNAs with high (RPM > log 10 4), medium (log 10 2 < RPM < log 10 4) and low (log 10 1 < RPM < log 10 2) read per million values, respectively. In every case the order of the miRNAs (the representative bars) was the same. Bar lengths refer to the log10 RPM sequence number. Results of differential expression analysis of the 190 conserved miRNAs by comparing HO patients with (HO-proven, HO-probable) or without (HO-possible) IA. Volcano plot represents the DEMs showing statistically significant overexpression and underexpression (according to the log 2 -transformed fold change in relation to the -log 10-transformed P-value). The dashed line on the y-axis indicates the P-value = 0.05 threshold with statistically significant (P < 0.05) gene expression (up-and downregulation, respectively). Red circles indicate DEMs. Highlighted DEMs represent IA-specific miRNAs whose gene expression was not influenced by hematological malignancies. Differential expression analysis of the circulating DEMs led to the identification of distinct clusters. The DEM patterns were also clustered to confirm the diagnostic potential of circulating miRNA signatures due to IA disease progression. A hierarchically clustered heatmap was constructed by relating the log 2 -fold change expression values of the 36 DEMs in patients with IA to those in healthy volunteers (Fig. 3 ). Of these miRNAs, 15 (hsa-miR-16-2-3p, hsa-miR-342-5p, hsa-miR-32-5p, hsa-miR-26b-5p, hsa-miR-223-5p, hsa-miR-26a-5p, hsa-miR-625-3p, hsa-let-7a-5p/7c-5p, hsa-miR-92a-3p, hsa-miR-7706, hsa-miR-423-3p, hsa-miR-130b-5p, hsa-miR-423-5p, hsa-let-7b-5p, hsa-miR-486-5p) were significantly upregulated while 21 (hsa-miR-181b-5p, hsa-miR-152-3p, hsa-miR-23a/b-3p, hsa-miR-324-5p, hsa-miR-185-5p, hsa-miR-30a-5p, hsa-miR-130a-3p, hsa-miR-130b-3p, hsa-miR-191-5p, hsa-miR-361-5p, hsa-miR-93-3p, hsa-miR-339-5p, hsa-miR-103a-3p, hsa-miR-15a-5p, hsa-miR-20a-5p, hsa-miR-93-5p, hsa-miR-106a-5p/17-5p, hsa-miR-20b-5p, hsa-miR-221-3p, hsa-miR-106b-5p, hsa-miR-500a-3p) were downregulated due to IA. Three miRNAs (hsa-miR-1976, hsa-miR-423-5p, hsa-let-7b-5p) exhibited inconsistent expression patterns in IA patients. Beta diversity relationships are summarized in two-dimensional multi-dimensional scaling (MDS) scatterplots (Fig. 4) . Each point represents a sample, and distances between points are representative of differences in DEM expression. Diversity plots were generated to represent the DEM-induced alterations discriminating IA patients from controls, resulting in nonoverlapping clusters (cluster 1 and cluster 2) and representing different spatial ordinations. The MDS plot shows that on the basis of the expression patterns of the IA-related miRNA signatures, it is possible to discriminate patients (HO-proven, and HO-probable IA) from noninfected (HOpossible IA and H, healthy) controls. Validation of the DEMs. An essential component of reliable quantitative reverse transcription PCR (qRT-PCR) analyses is the normalization of gene expression data because it controls for variations and allows comparisons of gene expression levels among different samples. An ideal reference gene must be stably expressed, abundant and without any significant variation in its expression status 20 . Due to high heterogeneity, there is no consensus for the best reference gene to be used to normalize miRNA gene expression data in HO patients. In Figure 3 . DEMs measured in the whole blood of HO patients diagnosed with or without IA. Hierarchical clustering was performed using the log 2 -transformed relative read counts of the 36 DEMs selected on the basis of the differential expression analysis in HO patients demonstrating a fold change difference greater than 2, P < 0.01 relative to healthy controls. Euclidian distances revealed two main clusters: Leaf 1 and Leaf 2. Leaf 1 represents patients with high-risk IA (HO-proven IA and HO-probable IA). Leaf 2 represents patients with lowrisk IA (HO-possible IA). www.nature.com/scientificreports/ this study, 20 candidate reference genes were investigated to normalize the RT-qPCR data, and their stability was evaluated. On the basis of the overall ranking data, hsa-miR-181a-5p was found to be the most stable, showing the highest stability among the 20 tested miRNAs ( Supplementary Fig. 2 ). Of the 62 most abundant DEMs tested, 14 miRNAs were validated successfully by qRT-PCR across our sample groups. The 2-ΔΔCT method was used to quantify the relative fold changes in gene expression in patients (HO-proven and HO-probable vs. HOpossible) relative to healthy controls. To calculate relative changes in gene expression, for each sample, the normalized CT values of single miRNAs were related to the mean CT values measured in healthy controls according to Livak's 2-ΔΔCT method (Fig. 5a ). Based on these results, we found that the gene expression of 14 miRNAs (hsa-miR-191-5p, hsa-miR-106b-5p, hsa-miR-16-2-3p, hsa-miR-185-5p, hsa-miR-26a-5p, hsa-miR-26b-5p, hsa-miR-106b-3p, hsa-miR-15a-5p, hsa-miR-20a-5p, hsa-miR-20b-5p, hsa-miR-106a-5p, hsa-miR-103a-5p, hsa-miR-93-5p, hsa-miR-17-5p) exhibited significant changes due to IA. To strengthen the congruent gene expression tendencies of small RNA-seq data and qRT-PCR measurements, the normalized read counts (in RPM) of IA patients relative to healthy controls with their density distributions were also determined throughout the IA-infected (HO-proven and HO-probable IA) vs. noninfected (HOpossible IA) hematology and oncology patients (Fig. 5b ). To estimate the capabilities of DEMs to discriminate aspergillosis-infected and noninfected patients from whole blood samples, receiver operating characteristic (ROC) curve analyses were applied (Fig. 6 ). On the basis of qRT-PCR-validated gene expression analyses, eight DEMs were found to display high discriminatory power (hsa-miR-191-5p, hsa-miR-106b-5p, hsa-miR-16-2-3p, hsa-miR-26a-5p, hsa-miR-15a-5p, hsa-miR-20a-5p, hsa-miR-106a-5p and hsa-miR-17-5p). All of these miRNAs were downregulated in the IA confirmed group, representing statistically significant fold changes (P < 0.05) relative to noninfected controls. Five miRNAs (hsa-miR-191-5p, hsa-miR-106b-5p, hsa-miR-15a-5p, hsa-miR-20a-5p, hsa-miR-106a-5p) demonstrated excellent discriminatory power, with AUC values of 1. Three additional miRNAs (hsa-miR-16-2-3p, hsa-miR-26a-5p and hsa-miR-17-5p) displayed AUC values greater than 98%. In addition to examining the distribution of the CT values and the discriminatory power of the miRNAs, normalized CT values for cases (proven and probable IA) and controls (possible IA) were also dichotomized by mapping the sensitivity values in relation to 1-specificity to estimate the optimal cutoff values for these biomarkers. In every case, we also estimated the optimal cutoffs, defined as the points that maximized sensitivity and specificity. Computational prediction reveals genes and biological functions affected by dysregulated miRNAs. The biological effects of miRNAs depend on various factors. Predicted interactions were retrieved from the integrated databases. Target recognition refers to the process by which mature miRNAs recognize their www.nature.com/scientificreports/ complementary mRNA sequences and regulate gene expression. An online webtool algorithm, miRabel, was employed to predict the target genes or biological pathways related to the dysregulated miRNAs considering their evolutionary conservation, Watson-Crick complementarity, and thermodynamic properties between the seed region of the miRNA and its target mRNA 21 . On the basis of in silico data predictions, we generated a list of 55 target genes whose expression might be posttranscriptionally influenced by at least three IA-specific DEMs (Fig. 7) . Pathway analysis was performed with the KEGG database ( Supplementary Fig. 3 ). On the basis of the in silico pathway analyses, twelve relevant biological functions, "cell homeostasis", "trafficking/vascular transport", "extracellular matrix (ECM)", "cell adhesion", "cell differentiation", "cell cycle", "tumorigenesis", "apoptosis", "immune response", "infectious diseases", "synaptic plasticity" and "catabolic pathways", were found to be influenced by changes in the IA-affected miRNAs. Of these, tumorigenesis (27 hits), the cell cycle (20 hits), the immune response (17 hits), cell differentiation (14 hits) and apoptosis (13 hits) were the top 5 affected pathways. The associations of these miRNAs with the regulated genes of these pathways were experimentally proven by other previous studies . Representation of the fold changes in the qRT-PCR data and sequencing read numbers with their density distributions of validated DEMs measured in the whole blood of IA-infected and noninfected HO patients. (a) The downregulated gene expression of 14 DEMs was confirmed by qRT-PCR. Scatter plots represent the whole blood miRNA levels as relative miRNA concentrations with the formula 2-ΔCt (normalized to hsa-miR-181a-5p). Significant median differences in the miRNA levels between each group were determined by the nonparametric Mann-Whitney test (*P < 0.05, **P < 0.01, ***P < 0.001). (b) Density bars represent the normalized sequencing read numbers in patients relative to healthy controls, where the trend line indicates IA-infected (red) and noninfected (blue) patients. IFIs are a major cause of mortality in immunosuppressed patients. IA is the most common mold infection in immunocompromised hosts associated with a poor prognosis and high mortality if diagnosis is delayed. Missed diagnoses are encountered when appropriate diagnostic tools are not available, especially in low-income and middle-income areas 173 . Currently, the early detection of IA is very difficult because most patients have nonspecific symptoms, leading to postponement of a correct diagnosis and therapy. The identification of easily accessible, noninvasive, blood-born biomarkers at early stages of disease progression is crucial for the evaluation of high-risk subjects to establish follow-up strategies. Technological advances in high-throughput molecular methods have provided possibilities to detect miRNA expression patterns in different biological samples. Obtaining circulating miRNAs from the blood represents a minimally invasive method for the early detection of disease or to aid in treatment options. The discovery of disease-specific miRNA expression signatures is essential to obtain an accurate diagnosis and to better understand disease pathology. Blood is an easily obtained biofluid that can be used to identify biomarkers 174 . Considering the increasing evidence from the literature showing that the dysregulated expression of miRNAs plays a pivotal role in various infections, we proposed that certain circulating miRNAs may play a significant role in the outcome of IA, suggesting that their relative gene expression levels might also serve as indicators of disease progression 175, 176 . By performing small RNA sequencing, this study has undertaken a comprehensive exploratory evaluation to establish the full repertoire of circulating miRNAs in whole blood among critically ill patients at high risk of IFIs. Circulating miRNAs were also recently recognized as promising disease biomarkers in infectious diseases, but relatively few studies have examined their role in IA. The regulatory roles of hsa-miR-132-5p and hsa-miR-212-5p have been associated with fungal infections 18 . By considering baseline patient characteristics and underlying malignancies, our primary goal was to decipher aberrant miRNA expression patterns. We hypothesized that by comparing distinct miRNA-seq profiles of shared miRNAs between cases and controls, we can decipher specific prognostic markers that can aid in disease diagnosis. In this study, the most abundant, conserved miRNAs constituted 19.02% of the pool. Differential expression analysis was employed to systematically search the small RNA transcriptome data for a subset of circulating miRNAs representing the most promising combinations of DEMs. Of the potential DEMs, we identified a subset of miRNAs whose expression signatures are unlikely influenced by hematological malignancy but likely to indicators of IA infection. In miRNA-based biofluid analyses, when a continuous variable is considered a diagnostic marker, the method adopted for data normalization and the choice of the reference gene is very important. Using hsa-miR-181a-5p as a reference, we found that dysregulated hsa-miR-191-5p, Figure 6 . ROC curves were constructed to assess and visualize the performance of 8 preselected miRNAs. To measure the diagnostic accuracy of the miRNAs, relative fold changes were converted to qualitative (proven IA, probable IA vs. possible IA) indexes. The normalized CT values of eight miRNAs revealed their significant downregulation in IA-infected hematology and oncology patients (proven/probable) relative to noninfected patients (possible), indicating IA. Line graphs were used to calculate the optimal cutoff points. Scatter graphs represent the distribution of the CT values in cases and controls, and area plots represent the discriminatory power of the biomarkers. www.nature.com/scientificreports/ hsa-miR-106b-5p, hsa-miR-16-2-3p, hsa-miR-26a-5p, hsa-miR-15a-5p, hsa-miR-20a-5p, hsa-miR-106a-5p and hsa-miR-17-5p showed strong discriminatory power, with AUC values greater than 98%. Despite continued progress, target prediction of miRNAs remains a challenge, since aggregated databases often show inconsistent results. To date, approximately 3000 mature human miRNAs have been referenced in miRBase, but several recent studies suggest that there may be a larger number 177 . Furthermore, the bioinformatics identification of miRNA targets remains a challenge because mammalian miRNAs are characterized by poor homology toward their target sequence 21 . Confirmation of the potential biological relevance of these predicted targets is laborious, and it was not the goal of the current project. In relation to IA, the in silico analysis of miRNA-influenced genes suggested an enrichment of pathways associated with tumorigenesis, the cell cycle, the immune response, cell differentation and apoptosis. www.nature.com/scientificreports/ Interestingly, hsa-miR-16-2-3p was shown to have no influence on these genes, and hsa-miR-191-5p affected only the gene encoding the product of the microtubule-associated protein RP/EB family member 3 (MAPRE3). As a member of the transmembrane protein family, the product of the gene transmembrane protein 100 (TMEM100) was also experimentally proven to be involved in cell differentiation, apoptosis and synaptic plasticity [22] [23] [24] . Two genes, TMEM100 and MAPRE3, were epigenetically influenced by five miRNAs, and both were markedly targeted by hsa-miR-17-5p (TMEM100 miRabel score: 0.00056, MAPRE3 miRabel score: 0.00069), hsa-miR-20a-5p (TMEM100 miRabel score: 0.00048, MAPRE3 miRabel score: 0.0012), and hsa-miR-106b-5p (TMEM100 miRabel score: 0.00036, MAPRE3 miRabel score: 0.00108) and weakly targeted by hsa-miR-106a-5p (TMEM100 miRabel score: 0.0485, MAPRE3 miRabel score: 0.0488). Previous studies have also implied a direct link between TMEM100 and miR-106b-5p related to tumorigenesis [25] [26] [27] . Based on our data, dysregulated hsa-miR-17-5p, hsa-miR-20a-5p and hsa-miR-106b-5p target the signal transducer and activator of transcription 3 (STAT3) gene in HO-IA patients. The STAT3 gene encoding the transcription factor, which is a member of the STAT protein has also been proven to play an important regulatory role in both bacterial and fungal infectious diseases 28, 29 . A defect in the IFN-γ response in STAT3-deficient patients has already been proven upon stimulation with heat-killed Staphylococcus aureus and Candida albicans 30, 31 . In addition, the involvement of the tyrosine protein phosphatase nonreceptor type 4 protein, encoded by the PTPN4 gene, in infectious diseases was also proven that also plays a role in immunity and cell homeostasis [32] [33] [34] [35] [36] . We found that the PTPN4, STAT3 and RAP2C genes were the main targets with important roles in relevant biological processes. In humans, loss-of-function mutations of the STAT3 gene are frequently associated with susceptibility to bacterial as well as fungal infections 178 . Francois Danion and colleagues proved that STAT3-deficient patients with aspergillosis were associated with a defective adaptive immune response against A. fumigatus infection and produced lower levels of cytokines, including IFN-γ, IL-17, and IL-22 178 . Based on their estimations, one major protective host mechanism against A. fumigatus infection is via IFN-γ. Furthermore, a recent study showed that the majority of lung-derived T cells upon A. fumigatus infection were Th17 cells, suggesting that the decreased production of Th1 and Th17 cytokines in STAT3-deficient patients could be the reason for their susceptibility to A. fumigatus 179, 180 . The tumor suppressor protein encoding TMEM100 gene was found to be targeted by five IA-related miRNA biomarkers; hsa-miR-15a-5p, hsa-miR-17-5p, hsa-miR-20a-5p and hsa-miR-106a/b-5p. The fact that all of the miRNAs targeting TMEM100 have shown significant changes in gene expression in HO patients with aspergillosis also suggests its involvement in both potentially oncogenic and infection-related biological pathways 26 . Interestingly, in previous studies, the regulatory roles of some of these miRNAs were associated with infectious mycobacterial disorders. By binding to the 3'-untranslated region of cathepsin S (CtsS) mRNA, hsa-miR-106b-5p was found to be involved in the posttranscriptional gene regulation of CtsS during mycobacterial infection 181 . Additionally, the involvement of miR-26a-5p was defined upon Mycobacterium tuberculosis infection by targeting the IFNγ signaling cascade 182, 183 . Finally, by targeting STAT3, the involvement of hsa-miR-17-5p in the regulation of tuberculosis-induced autophagy in macrophages was also proven 184 . The experimental design of this study led us to decipher complex miRNA signatures associated with IA by integrating small RNA sequencing and multiple bioinformatics tools. A miRNA::mRNA regulatory network was also constructed to investigate relevant downstream molecular mechanisms of the predicted targeted genes of the captured miRNAs. To our knowledge, this is the first effort to understand the levels of blood-born, circulating miRNAs per IA to identify stable, abundant disease-specific biomarkers. Our results suggest that some DEMs have the potential to serve as good and abundant blood-born biomarkers for IA. Our data may also lead to a better understanding disease pathogenesis and provide insight into the complexity and diversity of small RNA molecules that regulate immunodeficient IA. Regarding its incidence, IA can be considered a rare disorder. Based on epidemiological data on IA the estimated occurrence of IA is 5-13% in HSCT recipients and 10-20% in patients receiving intensive chemotherapy for leukemia [185] [186] [187] . In our study, disease prevalence exceeded 25% which might be explained by the relatively small hemato-oncology population size (HO-proven/probable IA). Due to the imbalance and limited size of the study cohort, this study may be considered exploratory. For a higher level of confidence, differential expression of the miRNome should be studied in an extended cohort by recruiting patients from a more diverse HO population. Therefore, validation of the results in an extended population with a broader range of patients is needed. There is a lack of standardized protocols for miRNA extraction or quality and quantity assessment either. Furthermore, due to the high levels of endogenous ribonuclease activity and low RNA content quantity of circulating miRNAs seem to vary widely between commercially available kits 188 . Because of the poor RNA yield many profiling methods are using total RNA. Furthermore, nanospectrophotometry is highly sensitive for low RNA concentration, resulting to poor quality criteria. It also needs to be considered, that many miRNAs reported as circulating cancer biomarkers reflect a secondary effect on blood cells rather than a tumor cell-specific origin 189 . The fact, that circulating miRNAs are influenced by blood cell counts and hemolysis, establishing a correct and optimal miRNA extraction is crucial for biomarker studies. While of major interest for future biomarker development, this study presents a retrospective evaluation of our patient cohort, and no prospective validation of the identified miRNAs in independent cohorts has been performed. Therefore, for future studies of circulating miRNA biomarkers that are expressed in blood cells, miRNA expression levels should also be interpreted in light of blood cell counts. The most recent advances in the diagnosis of invasive fungal diseases indicate miRNAs. However, the number of patients at risk of IA is increasing globally, and data on disease-specific circulating miRNAs are scant. Microbiological laboratories still struggle to achieve a timely and adequate diagnosis. Numerous scientists tend to identify biomarkers that could help in the early diagnosis of IA. Therefore, the discovery of specific predisposing factors is essential to obtain an accurate diagnosis and a better understanding of disease pathophysiology. As circulating miRNAs are promising biomarkers for various diseases, in this study, we analyzed the small RNA transcriptomes of HO patients and healthy controls through next-generation sequencing to reveal IA-specific miRNA expression patterns. The identification of IA-specific miRNA signatures might also be essential for the elucidation of disease pathophysiology. Library preparation and sequencing. Libraries for small RNA sequencing were prepared using a NEB-Next® Small RNA Library Prep Set for Illumina ® (New England Biolabs Inc., United Kingdom) following the manufacturer's instructions. Two sequencing runs were performed, samples were divided to batches in a random manner and both runs contained samples from all study groups in order to address batch effects ( Supplementary Fig. 4) . Six microliters of 500 ng total RNA was used as the starting material to prepare the libraries. Multiplex adapter ligations (using 3′ and 5′ SR adaptors), reverse transcription primer hybridization, reverse transcription reactions and PCR amplifications were performed as described in the protocol. After PCR amplification, the cDNA constructs were purified with a QIAQuick PCR Purification Kit (28104, Qiagen, Hilden, Germany) and MagSI-NGS PREP Plus beads (MDKT00010075, magtivio BV, The Netherlands) following the modifications suggested in the NEBNext Multiplex Small RNA Library Prep protocol. Size selection of the amplified cDNA constructs was performed using E-Gel® EX 2% Agarose (G401002, Invitrogen by Thermo Fisher Scientific, Israel) with an E-Gel™ Power Snap Electrophoresis Device (G8100, Invitrogen by Thermo Fisher Scientific, Singapore) following the manufacturer's protocol. www.nature.com/scientificreports/ program. We summarized the sequencing quality across samples grouped by batches in order to detect outliers with poor quality (Supplementary Fig. 4 ). Additional trimming was performed with Trimmomatic (4:20 sliding window parameter) 192 . miRNA annotation was performed with miRge 2.0 software 193 . Sequencing reads were divided into two partitions with a target read length threshold of 28 bases. For the lower portion (< 28 bases) annotation reports showed that circa 95% of the reads were assigned miRNAs, while for the upper part of the reads (> 28 bases) no miRNAs were detected. Differential expression analysis was performed with the edgeR R package. Libraries in the program were normalized by trimmed mean of M values (TMM). Volcano plots from the edgeR result were generated using the EnhancedVolcano R package 194 . Statistical comparisons among groups were also checked with nonparametric Kruskal-Wallis test where sequencing read numbers were converted to RPM (reads per million reads) in order to normalize libraries. P values were adjusted with Benjamini-Hochberg method, and P < 0.05 was determined as significant difference. Additionally, clustermap was generated in Python (ver3.6.14) with the seaborn package (0.11.1) 195 , where dendograms were also created with hierarchical agglomerative clustering. Diagnostic performances of the DEMs. The diagnostic values of the preselected miRNA biomarkers were measured by easyROC, a web-based tool for ROC curve analysis 196 . The ROC curve was edited by plotting the true positive rates (sensitivity values on the y-axis) versus the false positive rates (1-specificity values on the x-axis). The area under the ROC curve (AUC) was also calculated and used as an accuracy index to evaluate the diagnostic performances of the selected miRNAs. Target and pathway prediction. miRabel 21 , a miRNA target prediction tool, was used to determine the gene targets of the 7 selected miRNAs. For every miRNA, the top 100 hits were chosen according to the generated miRabel scores. Pathway analysis was carried out with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [197] [198] [199] . Validation of miR-seq data by qRT-PCR. Total RNA (1.5 ng) was used for miRNA-specific reverse transcription using a TaqMan™ Advanced miRNA cDNA Synthesis Kit (Thermo Fisher Scientific, USA). Quantitative real-time PCR with 62 TaqMan™ Gene Expression Assays (Thermo Fisher Scientific, USA) was performed to detect miRNA expression profiles in 3 independent technical repeats, including negative controls (no template from RNA isolation and reverse transcription), using a LightCycler ® 480 Real-Time PCR System (Roche Diagnostics, Risch-Rotkreuz Switzerland). PCR conditions were as follows: 20 s at 95 °C, 50 cycles of 3 s at 95 °C and 30 s at 60 °C followed by 1 cycle of 3 min at 37 °C. To identify a stable endogenous miRNA control in whole blood samples from healthy controls and study participants, twenty candidate miRNAs were selected by RefFinder 200 . Among the 20 reference miRNAs, hsa-miR-181a-5p was the most stable and used for normalization. addressed because open lung biopsy was performed via postmortem thoracotomy 201 . Histological samples were taken from the major organs according to a standard protocol. Lung sampling was performed from three independent parts of the potentially infiltrated lung parenchyma. Ethical statement. The study protocol was approved by the Ethics Committee of the University Hospitals of Debrecen, Hungary (MK-JA/50/0096-01/2017) and carried out in accordance with the approved guidelines. Informed consent was obtained from the participants in the study. All sequence data used in the analyses were deposited in the Sequence read Archive (SRA) (http:// www. ncbi. nlm. nih. gov/ sra) under PRJNA754268 accession number. Situation Report Tracing the origin of invasive fungal infections Invasive candidiasis Epidemiology and outcomes of hospitalizations with invasive aspergillosis in the United States Immunosuppression trends in solid organ transplantation: The future of individualization, monitoring, and management Laboratory diagnosis of invasive aspergillosis The incidence of invasive aspergillosis among solid organ transplant recipients and implications for prophylaxis in lung transplants Invasive aspergillosis in transplant recipients Invasive aspergillosis in allogeneic stem cell transplant recipients: changes in epidemiology and risk factors Invasive fungal infections after allogeneic peripheral blood stem cell transplantation: Incidence and risk factors in 395 patients Secretory microRNAs as biomarkers of cancer miRNA expression profiles and potential as biomarkers in nontuberculous mycobacterial pulmonary disease Prior infections are associated with increased mortality from subsequent blood-stream infections among patients with hematological malignancies miRNAs as biomarkers in disease: Latest findings regarding their role in diagnosis and prognosis Serum microRNAs are promising novel biomarkers Circulating MicroRNAs in Disease Diagnostics and their Potential Biological Relevance Free circulating mircoRNAs support the diagnosis of invasive aspergillosis in patients with hematologic malignancies and neutropenia Specific and novel microRNAs are regulated as response to fungal infection in human dendritic cells The clinical application of microRNAs in infectious disease High-throughput RNA sequencing analysis of plasma samples reveals circulating microRNA signatures with biomarker potential in dengue disease progression Improving bioinformatics prediction of microRNA targets by ranks aggregation Tmem100, an ALK1 receptor signaling-dependent gene essential for arterial endothelium differentiation and vascular morphogenesis TMEM100 induces cell death in non-small cell lung cancer via the activation of autophagy and apoptosis Distribution of TMEM100 in the mouse and human gastrointestinal tract-A novel marker of enteric nerves TMEM100 negatively regulated by microRNA-106b facilitates cellular apoptosis by suppressing survivin expression in NSCLC Novel roles of TMEM100: Inhibition metastasis and proliferation of hepatocellular carcinoma MicroRNA-106b-5p boosts glioma tumorigensis by targeting multiple tumor suppressor genes Targeting STAT3 in cancer and autoimmune diseases The roles of SOCS3 and STAT3 in bacterial infection and inflammatory diseases Milder clinical hyperimmunoglobulin E syndrome phenotype is associated with partial interleukin-17 deficiency Severely impaired IL-12/IL-18/IFNγ axis in patients with hyper IgE syndrome Regulation of the catalytic activity of the human phosphatase PTPN4 by its PDZ domain Phosphatase PTPN4 preferentially inhibits TRIF-dependent TLR4 pathway by dephosphorylating TRAM The protein tyrosine phosphatase PTPN4/PTP-MEG1, an enzyme capable of dephosphorylating the TCR ITAMs and regulating NF-kappaB, is dispensable for T cell development and/or T cell effector functions Molecular basis of the interaction of the human protein tyrosine phosphatase non-receptor type 4 (PTPN4) with the mitogen-activated protein kinase p38γ* Attenuation of rabies virulence: Takeover by the cytoplasmic domain of its envelope protein Myt1 and Myt1l transcription factors limit proliferation in GBM cells by repressing YAP1 expression Microduplications disrupting the MYT1L Gene (2p25.3) are associated with schizophrenia STAT3 cooperates with phospholipid scramblase 2 to suppress type I interferon response Cellular heterogeneity landscape in laryngeal squamous cell carcinoma Identification of novel hypoxia response genes in human glioma cell line A172 Humanin, a mitochondrial-derived peptide released by astrocytes, prevents synapse loss in hippocampal neurons Predicting the potential ankylosing spondylitis-related genes utilizing bioinformatics approaches C-type lectin MCL is an FcRγ-coupled receptor that mediates the adjuvanticity of mycobacterial cord factor Evolutionarily assembled cis-regulatory module at a human ciliopathy locus Primary cilia in brain development and diseases Metabolism of vertebrate amino sugars with N-glycolyl groups: Elucidating the intracellular fate of the non-human sialic acid N-glycolylneuraminic acid Strategies to interfere with PDZ-mediated interactions in neurons: What we can learn from the rabies virus Regulation of cell adhesion by protein-tyrosine phosphatases: II. CELL-CELL ADHE-SION * PTPN4 negatively regulates CrkI in human cell lines Grb2 and Nck Act cooperatively to promote actin-based motility of vaccinia virus A novel ligand for an SH3 domain of the adaptor protein Nck bears an SH2 domain and nuclear signaling motifs Gene expression alterations associated with oleuropein-induced antiproliferative effects and s-phase cell cycle arrest in triple-negative breast cancer cells Death receptor 6 induces apoptosis not through type I or type II pathways, but via a unique mitochondriadependent pathway by interacting with Bax protein Identification and functional characterization of DR6, a novel death domain-containing TNF receptor Mitochondrial membrane protein Bcl-xL, a regulator of adult neuronal growth and synaptic plasticity: Multiple functions beyond apoptosis Small GTPase RBJ mediates nuclear entrapment of MEK1/MEK2 in tumor progression Homozygous mutation in the eukaryotic translation initiation factor 2alpha phosphatase gene, PPP1R15B, is associated with severe microcephaly, short stature and intellectual disability A missense mutation in PPP1R15B causes a syndrome including diabetes, short stature, and microcephaly Protein phosphatase 1, regulatory subunit 15B is a survival factor for ERα-positive breast cancer Familial tumoral calcinosis and the role of O-glycosylation in the maintenance of phosphate homeostasis Prognostic significance of UDP-N-acetyl-alpha-d-galactosamine:polypeptide N-acetylgalactosaminyltransferase-3 (GalNAc-T3) expression in patients with gastric carcinoma Influenza A virus-induced expression of a GalNAc transferase, GALNT3, via microRNAs is required for enhanced viral replication SNX9 couples actin assembly to phosphoinositide signals and is required for membrane remodeling during endocytosis SNX9 regulates tubular invagination of the plasma membrane through interaction with actin cytoskeleton and dynamin 2 SNX18 shares a redundant role with SNX9 and modulates endocytic trafficking at the plasma membrane SNX9 determines the surface levels of integrin β1 in vascular endothelial cells: Implication in poor prognosis of human colorectal cancers overexpressing SNX9 SNX18 and SNX33 are required for progression through and completion of mitosis SNX9 as an adaptor for linking synaptojanin-1 to the Cdc42 effector ACK1 Structure and plasticity of Endophilin and Sorting Nexin 9 SAR1B GTPase is necessary to protect intestinal cells from disorders of lipid homeostasis, oxidative stress, and inflammation Small GTPases SAR1A and SAR1B regulate the trafficking of the cardiac sodium channel Nav1.5 Rho family GTPase-dependent immunity in plants and animals Teashirt-3, a novel regulator of muscle differentiation, associates with BRG1-associated factor 57 (BAF57) to inhibit myogenin gene expression TSHZ3 functions as a tumor suppressor by DNA methylation in colorectal cancer Basolateral membrane expression of the Kir 2.3 channel is coordinated by PDZ interaction with Lin-7/CASK complex The stardust family protein MPP7 forms a tripartite complex with LIN7 and DLG1 that regulates the stability and localization of DLG1 to cell junctions Characterization of MALS/Velis-1, -2, and -3: A family of mammalian LIN-7 homologs enriched at brain synapses in association with the postsynaptic density-95/NMDA receptor postsynaptic complex HAUS, the 8-subunit human Augmin complex, regulates centrosome and spindle integrity The augmin complex plays a critical role in spindle microtubule generation for mitotic progression and cytokinesis in human cells TAOK3 is a MAP3K contributing to osteoblast differentiation and skeletal mineralization TAO kinases mediate activation of p38 in response to DNA damage Comparative kinome analysis to identify putative colon tumor biomarkers TAOK3 phosphorylates the methylenecyclopropane nucleoside MBX 2168 to its monophosphate Abnormal N-glycosylation pattern for brain nucleotide pyrophosphatase-5 (NPP-5) in Mecp2-mutant murine models of Rett syndrome A key tyrosine substitution restricts nucleotide hydrolysis by the ectoenzyme NPP5 DNAJB6, a key factor in neuronal sensitivity to amyloidogenesis Emerging roles and underlying molecular mechanisms of DNAJB6 in cancer Hsp40 protein DNAJB6 interacts with viral NS3 and inhibits the replication of the Japanese Encephalitis Virus Cell-autonomous and non-cell autonomous protection of DNAJB6 in Huntington's disease MiR-20a-5p mediates hypoxia-induced autophagy by targeting ATG16L1 in ischemic kidney injury The Thr300Ala variant in ATG16L1 is associated with improved survival in human colorectal cancer and enhanced production of type I interferon Heteromerization of angiotensin receptors changes trafficking and arrestin recruitment profiles Effect of luminal angiotensin II receptor antagonists on proximal tubule transport Angiotensin type 2 receptor-mediated apoptosis of human prostate cancer cells The pivotal link between ACE2 deficiency and SARS-CoV-2 infection EB1 and EB3 regulate microtubule minus end organization and Golgi morphology Dynamic microtubules regulate dendritic spine morphology and synaptic plasticity A Rab-GAP TBC domain protein binds hepatitis C virus NS5A and mediates viral replication Role for TBC1D20 and Rab1 in hepatitis C virus replication via interaction with lipid droplet-bound nonstructural protein 5A Protein kinase D regulates basolateral membrane protein exit from trans-Golgi network Angiotensin II directly triggers endothelial exocytosis via protein kinase C-dependent protein kinase D2 activation Identification of protein kinase D2 as a pivotal regulator of endothelial cell proliferation, migration, and angiogenesis Protein kinase D2 regulates chromogranin A secretion in human BON neuroendocrine tumour cells Protein kinase D2 mediates activation of nuclear factor kappaB by Bcr-Abl in Bcr-Abl+ human myeloid leukemia cells The lysosomal GPCR-like protein GPR137B regulates Rag and mTORC1 localization and activity Dynein light intermediate chain 2 Facilitates the metaphase to anaphase transition by inactivating the spindle assembly checkpoint GABAergic differentiation induced by Mash1 is compromised by the bHLH proteins Neurogenin2, NeuroD1, and NeuroD2 RASSF2 is a novel K-Ras-specific effector and potential tumor suppressor The Ras effector RASSF2 is a novel tumor-suppressor gene in human colorectal cancer RASSF2 associates with and stabilizes the proapoptotic kinase MST2 Atad2 is a generalist facilitator of chromatin dynamics in embryonic stem cells ATAD2 predicts poor outcomes in patients with ovarian cancer and is a marker of proliferation The E3 ligase tripartite motif 8 targets TAK1 to promote insulin resistance and steatohepatitis The roles of the TRIM E3-ubiquitin ligase family in innate antiviral immunity TRIM8 negatively regulates TLR3/4-mediated innate immune response by blocking TRIF-TBK1 interaction Interferon-β signal may up-regulate PD-L1 expression through IRF9-dependent and independent pathways in lung cancer cells Hepatitis B viraemia: Its heritability and association with common genetic variation in the interferon gamma signalling pathway Life-threatening influenza pneumonitis in a child with inherited IRF9 deficiency MicroRNA-20a-5p suppresses IL-17 production by targeting OSM and CCL1 in patients with Vogt-Koyanagi-Harada disease Pseudomonas aeruginosa exoenzyme S induces transcriptional expression of proinflammatory cytokines and chemokines MicroRNA-9 controls the expression of Granuphilin/Slp4 and the secretory response of insulin-producing cells Slp4-a/granuphilin-a inhibits dense-core vesicle exocytosis through interaction with the GDP-bound form of Rab27A in PC12 cells First description of a compensatory xylosyltransferase I induction observed after an antifibrotic UDP-treatment of normal human dermal fibroblasts Meta-analysis of continuous phenotypes identifies a gene signature that correlates with copd disease status UDP-glucuronate decarboxylase, a key enzyme in proteoglycan synthesis: Cloning, characterization, and localization* Human UDP-α-d-xylose synthase forms a catalytically important tetramer that has not been observed in crystal structures Upregulation of Lhx8 increase VAChT expression and ACh release in neuronal cell line SHSY5Y The folliculin tumor suppressor is a GAP for the RagC/D GTPases that signal amino acid levels to mTORC1 Ragulator-Rag complex targets mTORC1 to the lysosomal surface and is necessary for its activation by amino acids FYCO1 is a Rab7 effector that binds to LC3 and PI3P to mediate microtubule plus end-directed vesicle transport An activating mutation in STAT3 results in neonatal diabetes through reduced insulin synthesis STAT3 in CD4+ T helper cell differentiation and inflammatory diseases STAT3 and SOCS3 regulate NG2 cell proliferation and differentiation after contusive spinal cord injury HIF-1 α downregulates miR-17/20a directly targeting p21 and STAT3: a role in myeloid leukemic cell differentiation STAT3 is required for MiR-17-5p-mediated sensitization to chemotherapy-induced apoptosis in breast cancer cells H19 promotes non-small-cell lung cancer (NSCLC) development through STAT3 signaling via sponging miR-17 Upregulation of survivin by leptin/STAT3 signaling in MCF-7 cells Expression and distribution of distinct variants of E-MAP-115 during proliferation and differentiation of human intestinal epithelial cells Differentiation of human colon cancer cells changes the expression of beta-tubulin isotypes and MAPs MAP7 interacts with RC3H1 and cooperatively regulate cell-cycle progression of cervical cancer cells via activating the NF-κB signaling WAC, a functional partner of RNF20/40, regulates histone H2B ubiquitination and gene transcription WAC/RET: A novel RET oncogenic fusion variant in non-small cell lung carcinoma Chondroitin sulfate metabolism in the brain Sulfated glycosaminoglycans: Their distinct roles in stem cell biology Identification of phosphatase that dephosphorylates xylose in the glycosaminoglycan-protein linkage region of proteoglycans Cep97 and CP110 suppress a cilia assembly program Primary structure and expression of a novel human laminin alpha 4 chain Laminin-α4 and integrin-linked kinase mutations cause human cardiomyopathy via simultaneous defects in cardiomyocytes and endothelial cells Cloning and characterization of the human gene RAP2C, a novel member of Ras family, which activates transcriptional activities of SRE G1-phase cell-cycle regulation MicroRNA-188-5p promotes apoptosis and inhibits cell proliferation of breast cancer cells via the MAPK signaling pathway by targeting Rap2c Ras-related protein Rap2c promotes the migration and invasion of human osteosarcoma cells The Rap2c GTPase facilitates B cell receptor-induced reorientation of the microtubule-organizing center Regulation of clock and NPAS2 DNA binding by the redox state of NAD cofactors The circadian gene NPAS2, a putative tumor suppressor, is involved in DNA damage response ARID4B is critical for mouse embryonic stem cell differentiation towards mesoderm and endoderm, linking epigenetics to pluripotency exit Arid4b alters cell cycle and cell death dynamics during mouse embryonic stem cell differentiation Overexpression of ARID4B predicts poor survival in patients with hepatocellular carcinoma Analysis of the PD-1 ligands among gastrointestinal cancer patients Association of apoptosis genes in PDCD1 but not PDCD1LG2, FAS, and FASLG with pediatric idiopathic uveitis in Han Chinese PFKP signaling at a glance: An emerging mediator of cancer cell metabolism HOTAIRM1 knockdown enhances cytarabine-induced cytotoxicity by suppression of glycolysis through the Wnt/β-catenin/PFKP pathway in acute myeloid leukemia cells Human USP3 is a chromatin modifier required for S phase progression and genome stability JAK1 as a prognostic marker and its correlation with immune infiltrates in breast cancer Decreased expression of JAK1 associated with immune infiltration and poor prognosis in lung adenocarcinoma A tumor-associated mutation of FYVE-CENT prevents its interaction with beclin 1 and interferes with cytokinesis Effect of tumor-associated mutant E-cadherin variants with defects in exons 8 or 9 on matrix metalloproteinase 3 Matrix metalloproteinases and TGFβ1 modulate oral tumor cell matrix Computational and experimental studies on human misshapen/NIK-related kinase MINK-1 Misshapen/NIK-related kinase (MINK1) is involved in platelet function, hemostasis, and thrombus formation Suppression of Th17 cell differentiation by misshapen/NIK-related kinase MINK1 Why are so many cases of invasive aspergillosis missed Identification of potential microRNA panels for pancreatic cancer diagnosis using microarray datasets and bioinformatics methods The value of circulating microRNAs for early diagnosis of B-cell lymphoma: A case-control study on historical samples MicroRNAs and noncoding RNAs in hematological malignancies: Molecular, clinical and therapeutic implications Evidence for the biogenesis of more than 1000 novel human microRNAs Aspergillus fumigatus infection in humans with STAT3-deficiency is associated with defective interferon-gamma and Th17 responses Anti-Aspergillus human host defence relies on type 1 T helper (Th1), rather than type 17 T helper (Th17), cellular immunity Pulmonary immune responses against Aspergillus fumigatus are characterized by high frequencies of IL-17 producing T-cells Mycobacterium tuberculosis modulates miR-106b-5p to control cathepsin s expression resulting in higher pathogen survival and poor T-cell activation Tuberculosis-associated microRNAs: From pathogenesis to disease biomarkers The role of microRNAs and long non-coding RNAs in the Regulation of the immune response to mycobacterium tuberculosis infection MicroRNA 17-5p regulates autophagy in Mycobacterium tuberculosis-infected macrophages by targeting Mcl-1 and STAT3 Prospective surveillance for invasive fungal infections in hematopoietic stem cell transplant recipients Epidemiology and outcome of invasive fungal infections in solid organ transplant recipients Invasive fungal infections among organ transplant recipients: Results of the transplant-associated infection surveillance network (TRANSNET) Influence of plasma processing on recovery and analysis of circulating nucleic acids Blood cell origin of circulating microRNAs: A cautionary note for cancer biomarker studies Revised definitions of invasive fungal disease from the European Organization for Research and Treatment of Cancer/Invasive Fungal Infections Cooperative Group and the National Institute of Allergy and Infectious Diseases Mycoses Study Group (EORTC/MSG) Consensus Group Cutadapt removes adapter sequences from high-throughput sequencing reads FastQ Screen: A tool for multi-genome mapping and quality control miRge 2.0 for comprehensive analysis of microRNA sequencing data edgeR: A Bioconductor package for differential expression analysis of digital gene expression data seaborn: Statistical data visualization OptimalCutpoints: An R package for selecting optimal cutpoints in diagnostic tests Kyoto encyclopedia of genes and genomes KEGG: Integrating viruses and cellular organisms Toward understanding the origin and evolution of cellular organisms miRDeepFinder: A miRNA analysis tool for deep sequencing of plant small RNAs Trends in the postmortem epidemiology of invasive fungal infections at a university hospital We thank the conscientious work to the hospital workers, nurses at the Hematology/Oncology unit of Internal Medicine of University Hospitals Debrecen and to the laboratory-and administrative staff of Virology from the Medical Microbiology Department and Aspergillosis Diagnostics laboratory at the Department of Human Genetics respectively. M.P. reviewed and interpreted the results and drafted the manuscript. M.P., G.F. designed the study plan and coordinated the study protocol. L.R., R.SZ. recruited the patients and performed the EORTC/MSG categorization. A.A.R., P.D., E.T. and SZ.P. participated in the total RNA isolations, small RNA sequencing, qRT-PCR analyses. G.F. performed the data analyses and prepared the figures. S.B. reviewed and conducted critical revision of the manuscript for important intellectual content. All authors read and approved the final manuscript. Open access funding provided by University of Debrecen. The authors declare no competing interests. The online version contains supplementary material available at https:// doi. org/ 10. 1038/ s41598-022-11239-z.Correspondence and requests for materials should be addressed to M.P.Reprints and permissions information is available at www.nature.com/reprints.Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.