key: cord-0755311-2t6ucw3v authors: Perrone, Rosalba; Lavezzo, Enrico; Riello, Erika; Manganelli, Riccardo; Palù, Giorgio; Toppo, Stefano; Provvedi, Roberta; Richter, Sara N. title: Mapping and characterization of G-quadruplexes in Mycobacterium tuberculosis gene promoter regions date: 2017-07-18 journal: Sci Rep DOI: 10.1038/s41598-017-05867-z sha: 83f6440ed5fdcf297d2728df93830c842076c33c doc_id: 755311 cord_uid: 2t6ucw3v

Mycobacterium tuberculosis is the causative agent of tuberculosis (TB), one of the top 10 causes of death worldwide in 2015. The recent emergence of strains resistant to all current drugs urges the development of compounds with new mechanisms of action. G-quadruplexes are nucleic acid secondary structures that may form in G-rich regions to epigenetically regulate cellular functions. Here we implemented a computational tool to scan for putative G-quadruplex-forming sequences in the genome of Mycobacterium tuberculosis and analyse their association with transcription start sites. We found that the most stable G-quadruplexes were in the promoter region of genes belonging to defined functional categories. Actual G-quadruplex folding of four selected sequences was assessed by biophysical and biomolecular techniques: all molecules formed stable G-quadruplexes, which were further stabilized by two G-quadruplex ligands. These compounds inhibited Mycobacterium tuberculosis growth with minimal inhibitory concentrations in the low micromolar range. These data support formation of Mycobacterium tuberculosis G-quadruplexes in vivo and their potential regulation of gene transcription, and prompt the use of G4 ligands to develop original antitubercular agents.

development of G4 specific antibodies 14, 15.
In viruses, G4s have been implicated in key steps 16: in the human immunodeficiency virus, the presence of functionally significant G4s 10, 13, 17-19 and their targeting by G4 ligands with consequent antiviral effects 10, 20, 21 have been reported. G4s have also been discovered in herpesviruses 22-25, SARS coronavirus 26 and human papilloma, Zika, Ebola and hepatitis C virus genomes 27-30. In prokaryotes, G4 sequences have been reported in Escherichia coli 7, 31, 32, Deinococcus radiodurans 33-35, Xanthomonas and Nostoc sp. 36. Evidence of bacterial enzymes that process G4s, such as Pif1 and RecQ helicases, has been provided in Escherichia coli, Clostridium difficile and Bacteroides sp. 37-42. Bacterial G4s have been implicated in antigenic variation of the cell-surface pilin proteins of Neisseria gonorrhoeae 43-46. In Mtb, whose genome has a GC content of 65%, previous bioinformatics analysis identified more than 10,000 motifs with the potential to fold into G4 structures 32. Additionally, evidence has been provided in Mtb for a specific helicase that targets G4s (DinG) and for a G4 aptamer that inhibits a polyphosphate kinase involved in intracellular inorganic polyphosphate metabolism 47, 48. The involvement of G4 structures in several human diseases propelled the development of small molecules directed against G4s 9. Aromatic cores with protonatable side chains, such as the acridine BRACO-19 49, 50 and water-soluble naphthalene diimides (NDIs) 21, 51-56, specifically bind the G4 conformation. So far, the vast majority of molecules have been tested against cellular G4s implicated in tumor pathogenesis: some compounds showed interesting antiproliferative properties 57; in particular, quarfloxin proceeded into phase II clinical trials, but its limited bioavailability prevented further progress 58.
In bacteria, N-methyl mesoporphyrin has been shown to attenuate Deinococcus resistance to radiation 33; to our knowledge, no other G4 ligand has so far been tested in bacteria. To search for G4 motifs in Mtb, we implemented a tool able to scan the whole genome and rank potentially interesting G4s according to their score. Only high-scoring hits close to known transcription start sites (TSS) were considered. Four G4 sequences, close to the TSS of genes with known function, were selected and their G4 folding was confirmed in solution. Two G4 ligands stabilized the selected G4s and inhibited bacterial cell growth with minimal inhibitory concentrations (MIC) in the low micromolar range.

Identification of putative G4 motifs in the promoter region of Mtb genes.
To detect the presence of putative G4 motifs, the Mtb genome was scrutinized in silico, assessing various lengths of G-islands and loops (Supplementary Figures 1a and b). A G4 was reported when at least four consecutive G-islands (n = 4) were identified. We also defined two parameters, l and d, corresponding to the minimal length of a homopolymeric G-island and the maximum allowed distance between consecutive G-islands, respectively. Different combinations of the l and d parameters were applied to allow the detection of G4 motifs with increasing stringency (i.e. 2 ≤ l ≤ 5 and d = 7, 11, and 15); we chose G4s with loop length up to 15 nucleotides since it has been reported that they can fold into stable G4s 59. Computational searches have detected a high concentration of G4 motifs near promoter regions in both eukaryotic and prokaryotic genomes, and in some cases a possible role of G4 motifs in transcription regulation has been reported 60. For this reason, and because of the high GC content of Mtb, we restricted G4 analysis to regions close to transcription start sites (TSS).
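The scan described above (islands of at least l guanines, loops of at most d nucleotides, at least n = 4 islands) can be illustrated with a minimal Python sketch. This is not the authors' in-house code: the names `scan_g4` and `G4Hit` are hypothetical, and two behaviours are assumptions made here for illustration, namely that a run longer than l counts as a single island and that a chain of more than four islands is reported as one hit. C-islands on the same sequence stand in for G4s on the reverse strand.

```python
import re
from typing import Iterator, NamedTuple


class G4Hit(NamedTuple):
    start: int    # 0-based index of the first base of the first island
    end: int      # 0-based index one past the last base of the last island
    strand: str   # "+" if built from G-islands, "-" if built from C-islands
    islands: int  # number of islands chained into this hit


def scan_g4(seq: str, l: int = 3, d: int = 7, n: int = 4) -> Iterator[G4Hit]:
    """Yield maximal chains of >= n islands (length >= l) separated by loops <= d."""
    seq = seq.upper()
    for base, strand in (("G", "+"), ("C", "-")):
        # Assumption: a homopolymer longer than l is matched greedily as ONE island.
        islands = [m.span() for m in re.finditer(f"{base}{{{l},}}", seq)]
        chain = []
        for island in islands:
            # Loop length = bases between the end of the previous island
            # and the start of this one; a longer loop closes the chain.
            if chain and island[0] - chain[-1][1] > d:
                if len(chain) >= n:
                    yield G4Hit(chain[0][0], chain[-1][1], strand, len(chain))
                chain = []
            chain.append(island)
        if len(chain) >= n:
            yield G4Hit(chain[0][0], chain[-1][1], strand, len(chain))


if __name__ == "__main__":
    for hit in scan_g4("AAGGGTTGGGTTGGGTTGGGAA", l=3, d=7):
        print(hit)
```

Running the scan once per (l, d) combination would give one hit list per stringency category; how overlapping hits across categories are merged is not specified in the text and is left out of the sketch.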
A short and a long score were computed considering 15 and 50 nucleotides, respectively, both upstream and downstream of the G4 motif, according to Beaudoin et al. 61 (Table 1). The genomic coordinates of the predicted G4s in both the forward (Supplementary File S1a) and the reverse strand (Supplementary File S1b) were intersected with the putative gene promoters, inferred by considering 50 nt upstream of the known primary TSS 62 (Table 1 and Supplementary File S2 "Primary TSS"). The G4 motifs overlapping promoter regions were ranked by the short and long scores (Supplementary File S2 "G4 overlapping promoters"). As expected, the number of detected G4 motifs decreased with the stringency of the search parameters (i.e. longer G-islands and shorter distances between them). Moreover, the distribution of the predicted G4s was homogeneous between the two strands of the genome, with a slight prevalence of the reverse strand in six categories (out of 12) as opposed to four categories that were more abundant in the forward strand (Table 1). Note that either the forward or the reverse strand can be the coding strand in transcription, depending on the gene.

Genes with putative DNA G4 forming sequences in Mtb.
Based on the described bioinformatics analysis, we identified 45 genes with a putative G4, upstream of or overlapping their TSS, with at least 3 Gs in each island (and therefore able to form a G4 with at least three stacked tetrads) and a short or long score ≥ 2 (Table 2 and Supplementary File S2 "Candidate genes"). This threshold was chosen according to Beaudoin et al. 61, who did not validate G4s with lower scores. These genes were classified according to their functional category as reported in TubercuList 63.
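The promoter intersection and candidate filter described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the names `Interval`, `G4`, `promoter_window` and `candidate_hits` are hypothetical, coordinates are taken as 1-based and inclusive, the window is assumed to exclude the TSS nucleotide itself, genome circularity is ignored, and the short and long scores are treated as precomputed inputs.

```python
from typing import Iterator, NamedTuple, Sequence, Tuple


class Interval(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


class G4(NamedTuple):
    span: Interval
    min_island: int     # length of the shortest G-island in the motif
    short_score: float  # score with 15-nt flanks, computed elsewhere
    long_score: float   # score with 50-nt flanks, computed elsewhere


def promoter_window(tss: int, strand: str, width: int = 50) -> Interval:
    """Return the `width` nt upstream of a TSS in genome coordinates.

    Upstream of a reverse-strand TSS lies at HIGHER genome coordinates.
    """
    if strand == "+":
        return Interval(max(1, tss - width), tss - 1)
    if strand == "-":
        return Interval(tss + 1, tss + width)
    raise ValueError(f"unknown strand: {strand!r}")


def overlaps(a: Interval, b: Interval) -> bool:
    """True when the two closed intervals share at least one nucleotide."""
    return a.start <= b.end and b.start <= a.end


def candidate_hits(
    g4s: Sequence[G4],
    tss_sites: Sequence[Tuple[int, str]],
    min_island: int = 3,
    min_score: float = 2.0,
) -> Iterator[Tuple[int, str, G4]]:
    """Yield (tss, strand, g4) for G4s that overlap a promoter and pass both filters."""
    for tss, strand in tss_sites:
        window = promoter_window(tss, strand)
        for g4 in g4s:
            if (
                overlaps(g4.span, window)
                and g4.min_island >= min_island
                and max(g4.short_score, g4.long_score) >= min_score
            ):
                yield tss, strand, g4
```

The sketch does not require the G4 and the TSS to lie on the same strand, since the text does not state such a constraint; the nested loop is quadratic and would be replaced by an interval index for a full genome.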
In addition, a de novo function prediction based on Gene Ontology (GO) annotations was performed with the online server Argot2.5 64 to expand the available annotations and potentially define functions for genes that are still hypothetical/unknown (Supplementary File S2 "Function prediction"). Overall, 35 genes out of 45 were annotated with at least one GO term: 8 of them had been previously unannotated, while the annotations of the others were confirmed or expanded (Supplementary File S2 "Candidate genes"). We found that most G4s were distributed among the following functional categories: "cell wall and cell processes", "intermediary metabolism and respiration", "regulatory proteins", and "conserved hypotheticals" (i.e. conserved proteins with no confirmed known function). Among the identified putative G4s, the sequence upstream of rv0166 (fadD5) (Supplementary File S2 "Candidate genes") had been previously reported by Thakur and colleagues to fold into a G4 structure 47. The same authors reported two additional genes displaying a G4 motif; these genes are not present in our analysis since they are not associated with a reported TSS 62.

Scientific Reports | 7: 5743 | DOI:10.1038/s41598-017-05867-z

Selected G-rich sequences in the Mtb genome fold into G4.
Among the genes with a predicted G4 in their promoter region, we selected four candidates for further experimental validation, namely Glucose-6-phosphate dehydrogenase 1 (zwf1), ATP-dependent Clp protease (clpX), Oxidation-sensing Regulator Transcription Factor (mosR), and membrane NADH dehydrogenase (ndhA) (Table 2). The choice fell on putative G4s belonging to the most stable categories (at least three Gs in each island and loops no longer than 11 nt), prioritizing those present in multiple categories (for instance, zwf1 has a G4 that falls in both the 3_4_7 and 3_4_11 categories), with at least one score > 2 and located in the promoter of genes with a known function.
G4 folding and topology were initially assessed by circular dichroism (CD) spectroscopy in the absence or presence of increasing concentrations of K+, since this monovalent cation is reported to stabilize the G4 conformation. All the selected molecules displayed the G4 CD signature in the presence of K+ (Fig. 1a-d). The zwf1 G4 structure exhibited a mixed-type conformation in K+, with a shoulder at 265 nm and a positive and a negative peak at 290 nm and 240 nm, respectively (Fig. 1a). The clpX G4 adopted a parallel-like conformation in K+, with a maximum at 265 nm and a minimum at 240 nm (Fig. 1b). The mosR G4 folded into a mixed-type conformation in K+, showing a spectrum with two positive peaks (267 and 290 nm) and a negative peak at 240 nm (Fig. 1c). Molar ellipticity values of all these structures increased in a K+-dependent manner, further supporting G4 formation (Fig. 1a-c). zwf1 and mosR displayed a G4-like CD spectrum (mixed-type conformation) also in the absence of K+, indicating a high propensity to fold and high stability. The ndhA G4 sequence transitioned from mixed-type in the absence of K+ to fully antiparallel (CD spectrum with two maxima at 240 and 290 nm and a minimum at 265 nm) in the presence of 150 mM K+ (Fig. 1d). Overall, our data indicate that the selected Mtb sequences can effectively fold into G4 conformations.

Stability of the zwf1, clpX, mosR and ndhA G4s in the absence and presence of increasing K+ concentrations (50-150 mM) was assessed by melting experiments monitored by CD, calculating the melting temperatures (Tm) according to the van't Hoff equation (Table 3). In all cases the CD signal decreased with increasing temperature. For the zwf1, clpX and mosR G4s, a single transition between 20 °C and 90 °C was observed, leading to discrete Tm values.
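The text does not spell out how the van't Hoff analysis was applied to the melting curves. For orientation, a common two-state (folded/unfolded, unimolecular) formulation is given below; it is an assumption about the general approach, not the authors' exact fitting procedure. Here θ(T) is the CD signal at the monitored wavelength, θ_F and θ_U are the folded and unfolded baselines, α is the folded fraction, K is the folding equilibrium constant, and ΔH° and ΔS° are the folding enthalpy and entropy (all symbols introduced here for illustration).

```latex
\begin{align}
\alpha(T) &= \frac{\theta(T) - \theta_U}{\theta_F - \theta_U}, \\
K(T) &= \frac{\alpha(T)}{1 - \alpha(T)}, \\
\ln K(T) &= -\frac{\Delta H^{\circ}}{R\,T} + \frac{\Delta S^{\circ}}{R}, \\
T_m &= \frac{\Delta H^{\circ}}{\Delta S^{\circ}}
\qquad \text{(the temperature at which } K = 1 \text{ and } \alpha = 1/2\text{)}.
\end{align}
```

Under this model a plot of ln K against 1/T is linear, with slope −ΔH°/R and intercept ΔS°/R; a curve with two transitions, as reported for ndhA in K+, does not satisfy the two-state assumption as a whole and would need each transition treated separately.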
The ndhA G4 showed a peculiar behaviour, with a relatively high Tm (60.5 ± 0.3 °C) in the absence of K+ and two different Tm values in the presence of K+, ascribable to two transitions due to the presence of spectroscopically distinct species in solution. Overall, Tm values increased in a K+-dependent manner, indicating that the G4s were stabilized by K+, with Tm increases of up to 34.1 °C (Table 3).

Effect of G4 ligands on Mtb G4s.
We next investigated Mtb G4 sequences in the presence of G4 ligands that have been reported to specifically recognize and stabilize G4 structures over double- and single-stranded nucleic acids. In particular, we tested a commercially available G4 ligand, BRACO-19 65, and a newly synthesized compound, c-exNDI 2 21. The effect of the two G4 ligands on the selected sequences in the presence of 100 mM K+ was initially assessed by CD analysis: they induced mild conformational changes in Mtb G4s without affecting the main topology, which remained characteristic of the G4 conformation (Fig. 2).

Position of the found G4s in the Mtb genome is available in Supplementary Files S1a and S1b.

Table 2. G4 sequences upstream of or overlapping TSS in the Mtb genome, forming G4s with at least three stacked tetrads (at least 3 Gs in each G-rich island) and with short or long score ≥ 2. G tracts with at least three Gs are shown in bold. GG tracts are underlined since they may aid G4 folding. Tracts with the potential to form a bulged G4 (i.e. GXGG, where X is any of the three remaining bases) are additionally shown in italics. The symbol ^ indicates genes whose G4 sequences were chosen for further investigation. Rv number is the gene numbering in the reference strain H37Rv. a: Position of the last nt of the G4 motif with respect to the TSS. Asterisks indicate that the reported G4 sequence is in the reverse strand.

G4 ligand-induced stabilization was assessed by CD thermal unfolding analysis.
G4 ligands strongly stabilized Mtb G4s, with Tm values in some cases higher than 90 °C (Table 3). In cases where several transitions were observed (Supplementary Figures 2 and 3), Tm values for each transition are reported (Table 3). The zwf1 G4 was the most efficiently stabilized sequence, with an increase in Tm of more than 41.5 °C in the presence of both BRACO-19 and c-exNDI 2 (Table 3).

G4 folding of the zwf1, clpX, mosR and ndhA sequences in the absence/presence of G4 ligands was additionally tested by the Taq polymerase stop assay (Fig. 3). This technique makes it possible to evaluate G4 formation in a DNA template and G4 involvement in arresting Taq polymerase processing. The G4-specific block can then be accurately resolved on a denaturing polyacrylamide gel in terms of intensity and position in the sequence. For this purpose, a primer annealing region was added at the 3′-end of the zwf1, clpX, mosR and ndhA oligonucleotides. Moreover, additional flanking T bases were added at both the 5′- and 3′-ends to separate the 3′-end of the primer from the first G of the G4 portion. Samples were incubated in the absence or presence of 100 mM KCl (Fig. 3a, lanes 1 and 2, respectively), and with 200 nM BRACO-19 or 100 nM c-exNDI 2 (Fig. 3a, lanes 3 and 4, respectively). A control template unable to fold into G4 was also used to exclude unspecific inhibition of the polymerase by the G4 ligands. Taq polymerase was tested at 47 °C on all DNA templates. With all Mtb G4 templates, G4 ligands blocked enzyme processing (Fig. 3a, *, ¤, § and # symbols in lanes 3-4). Stop sites were specific and located at or just before the first 5′ G-tract involved in G4 folding (Fig. 3b). No stop site was detected on the negative control template (Fig. 3a). Quantitative analysis of G4 stop bands showed increased G4 formation in the presence of G4 ligands for all G4-forming sequences (Fig. 3c).
Taken together, these data indicate that the tested G4 binders strongly recognize and stabilize Mtb G4 sequences.

Effect of G4 ligands on Mtb growth.
The effect of BRACO-19 and c-exNDI 2 on Mtb growth was analyzed using a REsazurine Microplate Assay (REMA). As shown in Fig. 4, both compounds inhibited bacterial cell growth with minimal inhibitory concentrations (MIC80) in the micromolar range; c-exNDI 2 was 10 times more potent than BRACO-19, with an MIC80 of 1.25 μM vs 12.5 μM. The increased potency of c-exNDI 2 may be at least in part due to its higher efficiency in stabilizing Mtb G4s (Table 3). However, the intracellular concentration reached by these compounds under the investigated conditions is not known. Interestingly, at least for BRACO-19, the MIC80 was lower than the toxic concentration for eukaryotic cells 20, supporting the possibility of using G4 ligands to develop new antitubercular agents.

Among the identified putative G4s in the Mtb genome, we selected 45 that were located upstream of confirmed TSS and formed by at least 3 Gs in each island. The genes with predicted G4s associated with their TSS were distributed in several functional gene categories. Four putative G4s were selected for further characterization: we showed that all of them actually folded and were stabilized by two G4 ligands. Interestingly, the two ligands were able to inhibit Mtb growth in vivo. Our data support the possibility of Mtb G4 formation in vivo and their role as potential modulators of gene expression. Finally, our data suggest the possibility of using G4s as novel targets to develop antitubercular agents with a new mechanism of action.

Bioinformatics prediction of putative G4 motifs in the Mtb genome.
An algorithm for the detection of putative G4 motifs was developed in house using the Perl programming language and was applied to the reference genome of Mtb H37Rv (NC_000962.3).
First, all guanine homopolymers (G-islands) were identified through pattern matching with the following line of code: (equation I) where seq is the complete genome of Mtb and l is the minimum length required for the homopolymer. A putative G4 was reported when at least four G-islands were detected and the distance between consecutive homopolymers (loop region) was less than or equal to an additional parameter d (distance). G4s in the reverse strand were searched considering cytosines (C) in the same reference sequence. In order to rank the identified G4s and focus only on those with the highest folding probability, we implemented a score measure as reported by Beaudoin et al. 61. This score evaluates the presence and the relative … where 'Gs(i)' is the set of substrings of consecutive 'Gs' found in the string s, and |Gs(i)| is the cardinality of the set. A short and a long score were calculated, considering the G4 region plus 15 or 50 nucleotides upstream and downstream. The genomic coordinates of the predicted G4s were then intersected with promoter regions. To this aim, the list of primary TSS 62 was exploited to extract putative promoters, which were considered embedded in the 50 nt upstream of each TSS (downstream for TSS in the reverse strand). A G4 was deemed associated with a TSS when at least one nucleotide of the G4 overlapped with the promoter. A list of all potential G4s associated with promoters is provided in Supplementary File S1.

Table 3. Melting temperatures (Tm) of Mtb G4 oligonucleotides (4 µM) in the absence and presence of increasing KCl concentrations (50-150 mM) and G4 ligands (16 µM). When more than one G4 species was observed in the CD spectrum (i.e. I, II), Tm values for each species were reported. B19 and NDI stand for the G4 ligands BRACO-19 and c-exNDI 2, respectively.

… Table S1). BRACO-19 was from ENDOTHERM (Saarbruecken, Germany); c-exNDI-2 was synthesized by Dr. Filippo Doria and Prof. Mauro Freccero (University of Pavia).
For CD analysis, all DNA oligonucleotides were diluted to a final concentration of 4 μM in lithium cacodylate buffer (10 mM, pH 7.4) and, where appropriate, KCl (50-150 mM). After annealing (95 °C for 5 min), all samples were gradually cooled to room temperature and compounds were added from stock to a final concentration of 16 µM. CD spectra were recorded on a Chirascan™-Plus (Applied Photophysics, Leatherhead, UK) equipped with a Peltier temperature controller, using a quartz cell of 5-mm optical path length and an instrument scanning speed of 50 nm/min over a wavelength range of 230-320 nm. The reported spectrum of each sample, representing the average of 2 scans, is baseline-corrected for signal contributions due to the buffer. Observed ellipticities were converted to mean residue ellipticity (θ) = deg × cm² × dmol⁻¹ (mol. ellip.). For the determination of Tm, spectra were recorded over a temperature range of 20-90 °C, with temperature increase of …

Taq polymerase stop assay.
Taq polymerase stop assay was carried out as previously described 10. Briefly, the 5′-end labelled primer was annealed to its template (Supplementary Table S1) in lithium cacodylate buffer in the presence or absence of 100 mM KCl by heating at 95 °C for 5 min and gradually cooling to room temperature. Where specified, samples were incubated with BRACO-19 (250 nM) or c-exNDI-2 (100 nM). Primer extension was conducted with 2 U of AmpliTaq Gold DNA polymerase (Applied Biosystems, Carlsbad, California, USA) at 47 °C for 30 min. Reactions were stopped by ethanol precipitation; primer extension products were separated on a 16% denaturing gel and finally visualized by phosphorimaging (Typhoon FLA 9000).

Mtb strains and growth conditions.
Mtb strain H37Rv was grown at 37 °C in Middlebrook 7H9 containing 0.5% glycerol and supplemented with 10% bovine serum albumin (BSA)-D-dextrose-NaCl (ADN) and 0.05% Tween 80. Middlebrook 7H10 medium supplemented with ADN and glycerol was used as solid medium.
REsazurine Microtiter Assay (REMA).
Drug sensitivity was determined using REMA as previously described 66. Briefly, frozen stock cultures were grown on solid medium 7H10/ADN. Subsequently, a pre-culture was carried out in 2 ml of liquid medium (7H9/ADN) starting from an OD540 of 0.05. Cultures were then grown up to mid-exponential phase (OD540 0.6-0.8) and then diluted to an OD540 of 0.01. Microplates suitable for fluorescence reading (96-well FluoroNunc™ black flat-bottom plates) were used to determine the MIC of each bacterial strain. Serial dilutions were used to dispense the correct amount of each compound in each well. Each well was then inoculated with a bacterial suspension containing 5 × 10^4 cfu. The plates thus obtained were sealed and incubated for 1 week at 37 °C. After incubation, 10 µl (10% of final volume) of Alamar-Blue (Invitrogen) was added to each well and the plates, after another day of incubation at 37 °C, were read on a microplate reader (Tecan Infinite 200 Pro) to determine the relative fluorescence (excitation 535 nm and emission 590 nm). For each strain we used a positive control (cells without antibiotic) to determine the maximum fluorescence that could be obtained, and a negative control (medium plus antibiotic without cells).

Latent Mycobacterium tuberculosis infection
Multidrug and extensively drug-resistant tuberculosis
G-quadruplex structures: in vivo evidence and function
A sodium-potassium switch in the formation of four-stranded G4-DNA
Distance-dependent duplex DNA destabilization proximal to G-quadruplex/i-motif sequences
G-quadruplexes and their regulatory roles in biology
A matter of location: influence of G-quadruplexes on Escherichia coli gene expression
Topology of a G-quadruplex DNA formed by C9orf72 hexanucleotide repeats associated with ALS and FTD
Maizels, N. G4-associated human diseases
A dynamic G-quadruplex region regulates the HIV-1 long terminal repeat promoter
Characterization of DNA G-quadruplex species forming from C9ORF72 G4C2-expanded repeats associated with amyotrophic lateral sclerosis and frontotemporal lobar degeneration
Biological Function and Medicinal Research Significance of G-Quadruplex Interactive Proteins
Nucleolin stabilizes G-quadruplex structures folded by the LTR promoter and silences HIV-1 viral transcription
Quantitative visualization of DNA G-quadruplex structures in human cells
Detection of G-quadruplex DNA in mammalian cells
G-quadruplexes in viruses: function and potential therapeutic applications
Topology of a DNA G-quadruplex structure formed in the HIV-1 promoter: a potential target for anti-HIV drug development
Formation of a unique cluster of G-quadruplex structures in the HIV-1 Nef coding region: implications for antiviral activity
U3 region in the HIV-1 genome adopts a G-quadruplex structure in its RNA and DNA sequence
Anti-HIV-1 activity of the G-quadruplex ligand BRACO-19
Binding and Antiviral Properties of Potent Core-Extended Naphthalene Diimides Targeting the HIV-1 Long Terminal Repeat Promoter G-Quadruplexes
The Herpes Simplex Virus-1 genome contains multiple clusters of repeated G-quadruplex: Implications for the antiviral activity of a G-quadruplex ligand
Visualization of DNA G-quadruplexes in herpes simplex virus 1-infected cells
G-quadruplexes regulate Epstein-Barr virus-encoded nuclear antigen 1 mRNA translation
Role for G-quadruplex RNA binding by Epstein-Barr virus nuclear antigen 1 in DNA replication and metaphase chromosome attachment
The SARS-unique domain (SUD) of SARS coronavirus contains two macrodomains that bind G-quadruplexes
Zika Virus Genomic RNA Possesses Conserved G-Quadruplexes Characteristic of the Flaviviridae Family
Human papillomavirus G-quadruplexes
A highly conserved G-rich consensus sequence in hepatitis C virus core gene represents a new anti-hepatitis C target. Science advances 2, e1501535
Chemical Targeting of a G-Quadruplex RNA in the Ebola Virus L Gene
Suppression of gene expression by G-quadruplexes in open reading frames depends on G-quadruplex stability
Genome-wide prediction of G4 DNA as regulatory motifs: role in Escherichia coli global regulation
Genome-wide study predicts promoter-G4 DNA motifs regulate selective functions in bacteria: radioresistance of D. radiodurans involves G4 DNA-mediated regulation
G-quadruplex forming structural motifs in the genome of Deinococcus radiodurans and their regulatory roles in promoter functions
Topoisomerase IB of Deinococcus radiodurans resolves guanine quadruplex DNA structures in vitro
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp
G-quadruplex recognition activities of E. coli MutS
Identification of non-telomeric G4-DNA binding proteins in human, E. coli, yeast, and Arabidopsis
The Bacteroides sp. 3_1_23 Pif1 protein is a multifunctional helicase
RecG helicase activity at three- and four-strand DNA structures
Clostridium difficile TcdC protein binds four-stranded G-quadruplex structures
Substrate-specific inhibition of RecQ helicase
Neisseria gonorrhoeae RecQ helicase HRDC domains are essential for efficient binding and unwinding of the pilE guanine quartet structure required for pilin antigenic variation
An alternative DNA structure is necessary for pilin antigenic variation in Neisseria gonorrhoeae
G-quadruplexes in pathogens: a common route to virulence control? PLoS pathogens 11
RecA-binding pilE G4 sequence essential for pilin antigenic variation forms monomeric and 5′ end-stacked dimeric parallel G-quadruplexes
Mycobacterium tuberculosis DinG is a structure-specific helicase that unwinds G4 DNA: implications for targeting G4 DNA as a novel therapeutic approach
Aptamer-mediated inhibition of Mycobacterium tuberculosis polyphosphate kinase 2
Structural basis of DNA quadruplex recognition by an acridine drug
Biological activity of the G-quadruplex ligand RHPS4 (3,11-difluoro-6,8,13-trimethyl-8H-quino[4,3,2-kl]acridinium methosulfate) is associated with telomere capping alteration
Structural basis for telomeric G-quadruplex targeting by naphthalene diimide ligands
Quinone methides tethered to naphthalene diimides as selective G-quadruplex alkylating agents
Water soluble extended naphthalene diimides as pH fluorescent sensors and G-quadruplex ligands
Structure-based design and evaluation of naphthalene diimide G-quadruplex ligands as telomere targeting agents in pancreatic cancer cells
A photoreactive G-quadruplex ligand triggered by green light
Targeting of RET oncogene by naphthalene diimide-mediated gene promoter G-quadruplex stabilization exerts anti-tumor activity in oncogene-addicted human medullary thyroid cancer
Targeting G-quadruplexes in gene promoters: a novel anticancer strategy?
How long is too long? Effects of loop size on G-quadruplex stability
DNA secondary structures: stability and function of G-quadruplex structures
New scoring system to identify RNA G-quadruplex folding
Genome-wide Mapping of Transcriptional Start Sites Defines an Extensive Leaderless Transcriptome in Mycobacterium tuberculosis
TubercuList - 10 years after
Enhancing protein function prediction with taxonomic constraints - The Argot2.5 web server
Trisubstituted acridine derivatives as potent and selective telomerase inhibitors
Resazurin microtiter assay plate: simple and inexpensive method for detection of drug resistance in Mycobacterium tuberculosis

R.P. performed the spectroscopic and Taq polymerase stop assays and wrote the manuscript; E.L. implemented the bioinformatics analysis and wrote the manuscript; E.R. performed the REsazurine Microplate Assay (REMA); R.M. conceived of the work and wrote the manuscript; G.P. commented on the manuscript; S.T. developed the bioinformatics pipeline, coordinated the analysis and wrote the manuscript; R.P. performed REMA, coordinated the analysis and wrote the manuscript; S.N.R. conceived of the work, coordinated the analysis and wrote the manuscript. All authors analyzed the data and reviewed the manuscript.

Supplementary information accompanies this paper at doi:10.1038/s41598-017-05867-z

Competing Interests: The authors declare that they have no competing interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

… License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.