key: cord-0263685-dty18esg
authors: Zhang, Rongxin; Ke, Xiao; Gu, Yu; Liu, Hongde; Sun, Xiao
title: Whole genome identification of potential G-quadruplexes and analysis of the G-quadruplex binding domain for SARS-CoV-2
date: 2020-06-05
journal: bioRxiv
DOI: 10.1101/2020.06.05.135749
sha: 0cbbef10a8e38540eaacd97f1222145b9d61fe9a
doc_id: 263685
cord_uid: dty18esg

The Coronavirus Disease 2019 (COVID-19) pandemic caused by SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2) quickly became a global public health emergency. The G-quadruplex, a non-canonical nucleic acid secondary structure, has shown potential as an antiviral target. However, little is known about G-quadruplexes in the emerging SARS-CoV-2. Herein, we characterized the potential G-quadruplexes (PG4s) in both the positive-sense and negative-sense viral strands. The identified PG4s exhibit features similar to those of the G-quadruplexes detected in the human transcriptome. Within some bat- and pangolin-related beta coronaviruses, the G-quartets rather than the loops are under heightened selective constraints. We also found that a SUD-like sequence is retained in the SARS-CoV-2 genome, whereas it is absent from some other coronaviruses that can infect humans. Further analysis revealed that the SARS-CoV-2 SUD-like sequence is almost fully conserved among 16,466 SARS-CoV-2 samples, and that the SARS-CoV-2 SUDcore-like dimer displays an electrostatic potential pattern similar to that of the SUD dimer. Considering the potential value of G-quadruplexes as targets in antiviral strategies, we hope that this fundamental research can provide new insights for SARS-CoV-2 drug discovery.

To obtain the potential G-quadruplexes in the SARS-CoV-2 genome, we adopted the following strategy (Fig. 2A):
(i) The PG4s were predicted independently with three software tools.
(ii) The prediction results were merged, and the G-quadruplex folding capability of each PG4 was evaluated by its cG/cC score.
(iii) The PG4s with cG/cC scores higher than the threshold were selected as candidates for further analysis.
Here, the threshold for determining whether a PG4 can fold was set to 2.05, as described in the study of Jean-Denis Beaudoin et al. [54]. In total, we obtained 24 PG4s (Table 1) in the positive-sense or negative-sense strands for further analysis. To annotate the PG4s, the reference annotation data (in gff3 format) of SARS-CoV-2 were downloaded from the NCBI database under the accession number NC_045512. First, we focused on the PG4s on the positive-sense strand. Fifteen of the 24 PG4s (62.5%) were located on the positive-sense strand. The vast majority of them lay in sequences encoding non-structural proteins, including nsp1, nsp3, nsp4, nsp5, nsp10 and nsp14, with the remaining ones located in the sequences encoding the spike protein, orf3a, and the membrane protein. Second, we examined the PG4s on the negative-sense strand, which is an intermediate product of replication. Nine PG4s were scattered along the negative-sense strand.

To further characterize the potential canonical secondary structures that compete with G-quadruplexes, the landscape of thermodynamic stability of the SARS-CoV-2 genome was depicted by using the ΔG° z-score [55]. In general, a positive ΔG° z-score implies that the secondary structure of a region tends to be less stable than that of randomly shuffled sequences with identical nucleotide composition, while a negative ΔG° z-score signifies higher stability than the randomly shuffled sequences. For each nucleotide in the SARS-CoV-2 genome, the ΔG° z-score was calculated for all the 120 nt windows covering the nucleotide, and the average ΔG° z-score was then derived. Several PG4s are located at positions with locally higher average ΔG° z-scores (Fig. 2B). This implies that a canonical secondary structure at these positions is relatively unstable and less likely to compete with the G-quadruplex, which may ultimately favor G-quadruplex formation.
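The per-nucleotide averaging described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes a list of per-window z-scores with a step of one nucleotide, where entry s belongs to the window starting at 0-based position s. The function names per_nucleotide_average and covering_windows are hypothetical.

```python
def covering_windows(i, n_windows, window_len):
    """Return (first, last) 0-based start indices of the windows covering nucleotide i."""
    first = max(0, i - window_len + 1)
    last = min(i, n_windows - 1)
    return first, last


def per_nucleotide_average(window_z, window_len):
    """Average the z-scores of all sliding windows (step 1) that cover each nucleotide.

    window_z[s] is assumed to be the z-score of the window starting at
    0-based position s; the sequence length is inferred from the input.
    """
    n_windows = len(window_z)
    seq_len = n_windows + window_len - 1
    # Prefix sums make each per-nucleotide sum an O(1) lookup.
    prefix = [0.0]
    for z in window_z:
        prefix.append(prefix[-1] + z)
    averages = []
    for i in range(seq_len):
        first, last = covering_windows(i, n_windows, window_len)
        total = prefix[last + 1] - prefix[first]
        averages.append(total / (last - first + 1))
    return averages
```

Nucleotides near either end of the genome are covered by fewer than 120 windows, so the divisor in this sketch is the actual number of covering windows rather than a fixed 120.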
Table 1
No.  Start  End    Strand  Sequence                        Annotation
1    15     37     -       GGTTGGTTTGTTACCTGGGAAGG         -
2    353    377    +       GGCTTTGGAGACTCCGTGGAGGAGG       nsp1
3    359    377    +       GGAGACTCCGTGGAGGAGG             nsp1
4    644    663    +       GGTAATAAAGGAGCTGGTGG            nsp1
5    2449   2472   -       GGGGCTTTTAGAGGCATGAGTAGG        -
6    3467   3483   +       GGAGGAGGTGTTGCAGG               nsp3
7    4261   4289   +       GGGTTTAAATGGTTACACTGTAGAGGAGG   nsp3
8    4262   4289   +       GGTTTAAATGGTTACACTGTAGAGGAGG    nsp3
9    4886   4901   -       GGTGGAATGTGGTAGG                -
10   6011   6027   -       GGATATGGTTGGTTTGG               -
11   8687   8709   +       GGATACAAGGCTATTGATGGTGG         nsp4
12   10015  10030  -       GGTTTGTGGTGGTTGG                -
13   10015  10039  -       GGTGATAGAGGTTTGTGGTGGTTGG       -
14   10019  10039  -       GGTGATAGAGGTTTGTGGTGG           -
15   10255  10282  +       GGTACAGGCTGGTAATGTTCAACTCAGG    nsp5
16   10261  10290  +       GGCTGGTAATGTTCAACTCAGGGTTATTGG  nsp5
17   13385  13404  +       GGTATGTGGAAAGGTTATGG            nsp10
18   15924  15941  -       GGATCTGGGTAAGGAAGG              -
19   18296  18318  +       GGATTGGCTTCGATGTCGAGGGG         nsp14
20   24268

In 2016, Chun Kit Kwok and co-workers profiled the RNA G-quadruplexes in the HeLa transcriptome by using the RNA G-quadruplex sequencing (rG4-seq) technology, and quantified the diversity of these RNA G-quadruplexes [56]. We set out to address whether the potential G-quadruplexes in SARS-CoV-2 show features analogous to those of the G-quadruplexes found in the human transcriptome, and whether these PG4s are able to form G-quadruplex structures. We noticed that the PG4s in SARS-CoV-2 are all of the two-quartet type. Therefore, we retrieved the two-quartet RNA G-quadruplex sequence data generated in the rG4-seq experiment under the K+ and pyridostatin (PDS) condition. However, for some RTS (Reverse Transcriptase Stalling) sites labeled as two-quartet, there may exist overlapping G-quadruplexes with different loops (e.g., GGCACAGCAGGCATCGGAGGTGAGGCGGGG), and it is difficult to determine which one was formed in the experiment. To eliminate this ambiguity, only the RTS sites containing a non-overlapping two-quartet G-quadruplex (e.g., GTCATTTTTTGTGTTTGGTTTGGTGGTGGC) were considered.
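One simple way to separate the two example sequences above is to count G-tracts. The sketch below is illustrative only: it assumes that a site is unambiguous when it contains exactly four runs of two or more guanines, which is a simplification and not necessarily the exact criterion applied to the rG4-seq data. The helper names two_quartet_loops and loop_features are hypothetical.

```python
import re


def two_quartet_loops(seq):
    """Return the three loop sequences if seq holds exactly four G-runs (>= 2 G), else None.

    Assumption: more than four G-runs means several overlapping
    two-quartet candidates, so the site is treated as ambiguous.
    """
    seq = seq.upper()
    runs = list(re.finditer(r"G{2,}", seq))
    if len(runs) != 4:
        return None
    # Each loop is the stretch between two consecutive G-runs.
    return [seq[a.end():b.start()] for a, b in zip(runs, runs[1:])]


def loop_features(loops):
    """Return (loop lengths, cytosine ratio over all loop nucleotides)."""
    lengths = [len(loop) for loop in loops]
    total = sum(lengths)
    c_count = sum(loop.count("C") for loop in loops)
    c_ratio = c_count / total if total else 0.0
    return lengths, c_ratio
```

With this rule the first example sequence is rejected because it contains six G-runs, while the second is accepted and yields the loop lengths and loop cytosine ratio used in the comparisons that follow.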
First, we investigated the loop length distribution of the two-quartet PG4s in both SARS-CoV-2 and the human transcriptome (Fig. 3A). Overall, the two sets displayed similar loop length distribution patterns, and the loop lengths of the SARS-CoV-2 PG4s fall within the range observed in the human transcriptome. The loop length distributions of the SARS-CoV-2 PG4s and the human two-quartet G-quadruplexes did not differ significantly (Fig. S1, Wilcoxon test, p-value = 0.4552). Because multiple-cytosine tracts may hinder the formation of G-quadruplexes [54, 57], we examined the cytosine ratio in G-quadruplex loops (Fig. 3B). No significant difference in loop cytosine ratios was observed between the SARS-CoV-2 PG4s and the human two-quartet G-quadruplexes (Wilcoxon test, p-value = 0.9911). Taken together, our results suggest that the PG4s in SARS-CoV-2 display features similar to those of the rG4s in the human transcriptome.

Recent research revealed that the G-quadruplexes in human UTRs (Untranslated Regions) are under selective pressure [58], and some coronaviruses of bats and pangolins are closely related to SARS-CoV-2. We therefore analyzed whether the potential G-quadruplexes in the SARS-CoV-2 genome are conserved under selective constraints. We collected beta coronavirus genomic sequences of bats and pangolins from several public databases and used the NJ (Neighbor-Joining) method to construct a phylogenetic tree with 1,000 bootstrap replications (Fig. S2). The RS (Rejected Substitutions) score for each site in the SARS-CoV-2 reference genome was evaluated by using the GERP++ software. We then compared the RS scores of the G-tract (continuous runs of G) nucleotides with those of other nucleotides.
A significant difference was observed, which means that the G-tract nucleotides are under stronger selective constraints than other nucleotides in the SARS-CoV-2 genome (Fig. 4A, Wilcoxon test, p-value = 9.254 × 10^-8). Considering that the G-tracts are composed of guanines, the conservation of guanines inside and outside the G-tracts in the SARS-CoV-2 genome was also compared. We found that the guanines in G-tracts are under heightened selective constraints (Fig. 4B, Wilcoxon test, p-value = 3.363 × 10^-3). The nucleotides within G-tracts are more relevant to the structural maintenance of G-quadruplexes than those in loops, so we next compared the G-tract and loop RS scores. As expected, the G-tract RS scores were significantly higher than the loop RS scores (Fig. 4C, Wilcoxon test, p-value = 3.962 × 10^-7), which suggests that the G-tracts experienced stronger selective constraints.

We also checked whether the heightened selective constraints on the PG4s are related to their inherent properties or potential functions rather than to their sequence contexts. A random test was performed to check whether the fragments containing PG4s show different average RS scores compared with random fragments in the SARS-CoV-2 genome. The fragment containing a PG4 was defined as the sequence spanning 100 nt upstream and 100 nt downstream of the PG4 center. We conducted 1,000 rounds of tests. In each round, we randomly selected 50 fragments with a length of 200 nt from the SARS-CoV-2 genome and carried out the Wilcoxon test to assess the difference in average RS score between the randomly selected fragments and the fragments containing PG4s. The p-value for each round was retained. No evident difference was observed, as few p-values (13/1000) were less than 0.05 (Fig. 4D), suggesting that the heightened selective constraints on the PG4s are more likely related to their inherent properties or potential functions than to their sequence contexts.
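The random test above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the rank-sum p-value is computed with a normal approximation (no continuity or tie correction in the variance) so that the example needs only the standard library, and the names rank_sum_pvalue and random_fragment_test, the seed, and the 0-based PG4 centers are all hypothetical.

```python
import random
from statistics import NormalDist, mean


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, average ranks for ties)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


def random_fragment_test(rs, pg4_centers, rounds=1000, n_random=50, flank=100, seed=0):
    """Compare mean RS scores of PG4 fragments with random fragments, round by round.

    rs is a per-nucleotide list of RS scores; pg4_centers are 0-based
    positions assumed to lie at least `flank` nt from either end.
    """
    rng = random.Random(seed)
    length = 2 * flank
    pg4_means = [mean(rs[c - flank:c + flank]) for c in pg4_centers]
    pvalues = []
    for _ in range(rounds):
        starts = [rng.randrange(0, len(rs) - length + 1) for _ in range(n_random)]
        random_means = [mean(rs[s:s + length]) for s in starts]
        pvalues.append(rank_sum_pvalue(random_means, pg4_means))
    return pvalues
```

The fraction of rounds with a p-value below 0.05 (13/1000 in the analysis above) can then be read off with sum(p < 0.05 for p in pvalues) / len(pvalues).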
Both SARS-CoV and SARS-CoV-2 can cause acute disease symptoms, and the two coronaviruses share similar nucleic acid sequence compositions. The SARS-CoV genome encodes a SUD (SARS-unique domain) that can bind G-quadruplex structures, and it is unclear whether the SARS-CoV-2 genome possesses a similar structure. Thus, we explored whether the SARS-CoV-2 genome contains a protein-coding sequence similar to SUD and whether SARS-CoV-2 retains the ability to bind RNA G-quadruplexes. We collected the ORF1ab amino acid sequences of a set of coronaviruses, including the seven coronaviruses known to infect humans and other coronaviruses belonging to different genera. Surprisingly, the SUD protein sequence is absent in some coronaviruses, especially in alpha, gamma, and delta coronaviruses (Fig. S3). In contrast, the SUD protein sequence is retained in several beta coronaviruses, particularly in bat- and pangolin-associated beta coronaviruses. Moreover, among the seven coronaviruses that can infect humans, only SARS-CoV and SARS-CoV-2 keep the SUD sequence, while it is absent in MERS-CoV, HCoV-229E, HCoV-NL63, HCoV-OC43 and HCoV-HKU1. Next, we examined eight key amino acid residues in SUD that were previously reported to be associated with G-quadruplex binding affinity (Fig. 5A). Almost all the key amino acid residues are conserved in SARS-CoV-2, except for one conservative replacement of K (Lysine) > R (Arginine). We hypothesized that if the G-quadruplex binding ability is essential for SARS-CoV-2, the above amino acid residues should be conserved. We then investigated the conservation of the eight amino acid residues within SARS-CoV-2 samples. We retrieved the sequence alignment file of 16,466 SARS-CoV-2 samples from the GISAID database and calculated the mutation frequency for each nucleotide. We then examined the frequency of nucleotide mutations in the above eight codons.
A limited mutation frequency was found in these codons compared with the whole-genome average mutation frequency (Fig. 5B, frequency = 3.96). Although eight mutations were detected in the glutamate codon (2432 E), seven of them were synonymous. Next, we checked the electrostatic potential pattern of the SARS-CoV-2 SUDcore-like dimer structure, defined as the dimer structure formed by the amino acid residues in SARS-CoV-2 that correspond to the SUD of SARS-CoV. We found that the SUDcore-like dimer of SARS-CoV-2 and the SUDcore of SARS-CoV present analogous electrostatic potential patterns. Positively charged patches were observed in the core of the SUDcore-like dimer, surrounded by negatively charged patches (Fig. 5C). When the dimer is rotated by 180°, a slightly inclined narrow cleft with negative potential, accompanied by positively charged patches, is visible (Fig. 5D). The same patterns also appear in the SUD dimer. In previous reports, several positively charged patches located in the center and back of the dimer were presumed to bind G-quadruplex structures. By comparison with the electrostatic potential of the SARS-CoV SUDcore dimer, we identified the positively charged patches located in the center and back of the SARS-CoV-2 SUDcore-like dimer, which can potentially bind G-quadruplexes (Fig. 5C-D).

The COVID-19 pandemic has caused huge losses and has drawn more attention to public health. A large number of scientists all over the world have been engaged in the fight against the outbreak. SARS-CoV-2 is the causative agent of the outbreak, and no specific inhibitor drugs have been developed yet. G-quadruplexes have shown considerable potential for the development of anticancer [59] [60] [61] [62] and antiviral drugs [44, 63, 64], as G-quadruplexes can interfere with many biological processes that are critical to cancer cells and viruses.
Therefore, it is necessary to quantify and characterize the PG4s in the SARS-CoV-2 genome to provide a possible novel approach for the treatment of COVID-19. In this study, in addition to three popular G-quadruplex prediction tools, the cG/cC scoring system, which is specially designed for the identification of RNA G-quadruplexes, was adopted to determine the PG4s. We did not find G-quadruplexes with three or more G-quartets, which are generally considered to be more stable than two-quartet G-quadruplexes. One controversial issue is the stability of two-quartet G-quadruplexes, especially their folding capability in vivo. However, it is well acknowledged that RNA G-quadruplexes are more stable than their DNA counterparts [65, 66], and SARS-CoV-2 is a single-stranded RNA virus, which may be conducive to G-quadruplex formation. Several recent studies have demonstrated the formation of two-quartet G-quadruplexes in viral sequences, which could serve as antiviral elements in the presence of G-quadruplex ligands [53, 67, 68]. Moreover, K+ (potassium ion), one of the primary cations inside human cells, can strongly support the formation of G-quadruplexes. Nevertheless, whether the SARS-CoV-2 G-quadruplexes form in vivo requires further experimental evidence.

Most of the PG4s we detected were located in the positive-sense strand. The G-quadruplex forming sequences in the SARS-CoV genome were presumed to function as chaperones of SUD, and their interaction was essential for SARS-CoV genome replication [69]. ORF1ab, which encodes the replicase proteins, is required for viral replication and transcription. Some PG4s were found in ORF1ab, and whether these PG4s are related to the replication of the viral genome and interact with SUD-like structures, as in SARS-CoV, is worthy of further investigation.
In addition to ORF1ab, several PG4s exist in the structural and accessory protein-coding sequences as well as in the sgRNAs that contain these sequences. Some studies have characterized the impact of G-quadruplex structures on the translation of human transcripts, and an apparent inhibitory effect was observed [38, 57, 70]. The translation of SARS-CoV-2 proteins requires the involvement of human ribosomes; thus, it may be possible to repress the translation of SARS-CoV-2 proteins by stabilizing the G-quadruplex structures. Indeed, this inhibitory effect has been reported in studies of some other viruses [67, 71]. The negative-sense strand serves as the template for the synthesis of the positive-sense strand and the sub-genomic RNAs. The identified potential G-quadruplexes were broadly distributed in the negative-sense strand. Notably, we observed one PG4 located at the 3' end of the negative-sense strand. A previous study confirmed that stable G-quadruplex structures located at the 3' end of the negative-sense strand could inhibit RNA synthesis by reducing the activity of the RdRp (RNA-dependent RNA polymerase) [72]. Therefore, it is necessary to further investigate whether the PG4 at the 3' end of the negative-sense strand of SARS-CoV-2 could inhibit RNA synthesis. In addition, recent research revealed high-frequency trinucleotide mutations (G28881A, G28882A and G28883C) in the SARS-CoV-2 genome [73, 74]. G28881A and G28882A always co-occur within the same codon, which suggests positive selection on the encoded amino acid [75]. We noticed that the trinucleotide mutations lie in the G-rich sequence from 28881 nt to 28917 nt (5' GGGGAACTTCTCCTGCTAGAATGGCTGGCAATGGCGG 3').
The potential G-quadruplex downstream of the trinucleotide mutations was filtered out by the cG/cC scoring system, because the cytosine tracts within and flanking the potential G-quadruplex reduce its cG/cC score. However, this potential G-quadruplex showed a relatively low MFE (Minimum Free Energy) among all the potential G-quadruplexes we detected. The consequence of the trinucleotide mutations remains elusive, and whether the mutations are causally related to the G-rich sequence still needs to be elucidated.

The SUD in SARS-CoV, which is thought to be related to its high pathogenicity, has displayed a binding preference for the G-quadruplexes in human transcripts [45]. Our analysis revealed that the novel coronavirus SARS-CoV-2 contains a domain similar to SUD as well. Furthermore, several amino acid residues previously reported to be indispensable for the G-quadruplex binding capability are retained in SARS-CoV-2. Further exploration indicated that the eight key amino acid residues were conserved in numerous SARS-CoV-2 samples from countries all over the world, suggesting that these residues are essential. It is supposed that the binding of SUD to G-quadruplexes could affect transcript stability and translation, hence impairing the immune response of host cells. The expression of host genes in SARS-CoV-2 infected cells is strongly inhibited [15]; therefore, we speculate that SARS-CoV-2 may possess a mechanism similar to that of SARS-CoV, which can inhibit the expression of some important immune-related genes to escape immune defense. Herein, we briefly depict the possible role of G-quadruplexes in the antiviral mechanism and pathogenicity, and the development of certain G-quadruplex specific ligands might be a promising antiviral strategy (Fig. 6). We call for more researchers to shed light on the relationship between G-quadruplexes and coronaviruses.
Only with a deeper understanding of coronaviruses can we better cope with possible novel coronavirus pandemics in the future.

Fig. 6 Possible role of G-quadruplexes in the antiviral mechanism and pathogenicity. Left part, G-quadruplexes can function as inhibitory elements in the SARS-CoV-2 life cycle. Both replication and translation could be affected by the G-quadruplex structures. The stable G-quadruplexes at the 3' end of the negative-sense strand may interfere with the activity of RdRp; hence, the synthesis of positive-sense strands from negative-sense strands is repressed, so that SARS-CoV-2 genomes cannot be produced in large quantities. The G-quadruplex structures can suppress translation by impairing ribosome elongation, which can hinder the production of proteins required by the virus. The G-quadruplex structures could be stabilized by specific ligands to enhance the inhibitory effects, which is a promising antiviral strategy. Right part, a possible mechanism by which SARS-CoV-2 impedes the expression of human genes. G-quadruplex structures, particularly those with longer G-stretches, are potential binding targets for SUD-like proteins. The interaction of the SUD-like proteins with G-quadruplex structures possibly leads to the instability of host transcripts or reduces their translation efficiency.

We obtained a total of 77 full-length bat-associated beta coronavirus genomes from the DBatVir (http://www.mgc.ac.cn/DBatVir/) database [76]. We also downloaded the bat coronavirus RaTG13 genome from the NCBI virus database (https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/), which has shown a high sequence similarity to the SARS-CoV-2 reference genome in previous reports. We acquired the SARS-CoV-2 reference genome from the NCBI virus database under the accession number NC_045512. In addition to those sequences, nine pangolin coronavirus genomes were obtained from the GISAID (https://www.gisaid.org/) database [77].
The EMBOSS Needle software, which is based on the Needleman-Wunsch algorithm and is part of the EMBL-EBI web tools [78], was employed for pairwise sequence alignment. Clustal Omega [79, 80] is a reliable and accurate multiple sequence alignment (MSA) tool that can handle large data sets. We chose this MSA tool for the alignment of viral genomes and the alignment of protein sequences under the default parameters. UGENE [81] is a powerful and user-friendly bioinformatics software package, and we chose it to visualize the pairwise and multiple sequence alignment results. We used the MEGA X software [82] to construct the Neighbor-Joining phylogenetic tree with 1,000 bootstrap replications. To depict the conservation state of each nucleotide site, the GERP++ software [83] was applied to calculate the "Rejected Substitutions" score column by column, which reflects the strength of the constraints on each nucleotide site.

Several open-source G-quadruplex detection tools were used to search for PG4s in both the positive-sense and negative-sense strands of SARS-CoV-2. G4CatchAll [84], pqsfinder [85], and QGRS Mapper [86] were each employed to predict the putative G-quadruplexes; see ref [87] for more information about the comparison of these tools. The minimum G-tract length was set to two in all three tools, while the maximum length of the predicted G-quadruplexes was limited to 30. Specifically, the minimum score of the predicted G-quadruplex was set to 10 when using pqsfinder. We utilized BEDTools [88] to sort the PG4s according to their coordinates. In addition, we adopted the cG/cC scoring system [54] proposed by Jean-Pierre Perreault et al. to account for the influence of the sequence context on PG4s. Each PG4, along with 15 nt of upstream and downstream sequence context, was used to calculate the cG/cC score, and 2.05 was taken as the threshold for the preliminary inference of the G-quadruplex folding capability [54].
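A cG/cC-style score of the kind described above can be sketched as follows. This is a minimal illustration and not the script used in the study. It assumes the weighting in which every run of i consecutive G (or C), counted with overlaps, contributes 10 × i, and it assumes that a cC score of zero is replaced by 1 to avoid division by zero; both points should be checked against ref [54]. The function names, and the convention of 1-based inclusive coordinates on an already strand-oriented sequence, are hypothetical.

```python
from itertools import groupby

THRESHOLD = 2.05  # folding threshold quoted in the text


def run_score(seq, base):
    """Sum 10 * i for every run of i consecutive `base`, counted with overlaps.

    A maximal run of length n contains (n - i + 1) overlapping runs of length i.
    """
    score = 0
    for char, group in groupby(seq.upper()):
        if char != base:
            continue
        n = len(list(group))
        for i in range(1, n + 1):
            score += (n - i + 1) * 10 * i
    return score


def cg_cc_score(seq):
    """Ratio of the G-run score to the C-run score (assumed fallback: cC of 0 becomes 1)."""
    cg = run_score(seq, "G")
    cc = run_score(seq, "C")
    return cg / (cc if cc else 1)


def with_flanks(genome, start, end, flank=15):
    """Return the PG4 plus `flank` nt on each side (start/end are 1-based, inclusive)."""
    return genome[max(0, start - 1 - flank):end + flank]
```

A candidate would then be retained when cg_cc_score(with_flanks(genome, start, end)) exceeds THRESHOLD. For PG4s on the negative-sense strand, the reverse complement of the genome would have to be supplied, which is not shown here.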
We implemented the cG/cC scoring system in a custom Python script.

The SARS-CoV-2 SUDcore-like homo-dimer structure was modeled on the template of the SARS-CoV SUD structure (PDB ID: 2W2G) through homology modeling. All modeling was performed on the SWISS-MODEL [89] website (https://swissmodel.expasy.org/). The electrostatic potential was calculated and visualized in the PyMOL software by using the APBS (Adaptive Poisson-Boltzmann Solver) plugin.

The ΔG° z-score for the SARS-CoV-2 genome was retrieved from RNAStructuromeDB (https://structurome.bb.iastate.edu/sars-cov-2). The ΔG° z-score is defined as follows:

\[ \Delta G^{\circ}\ z\text{-score} = \frac{MFE_{native} - \overline{MFE_{random}}}{\sigma} \]

where $MFE_{native}$ is the MFE (minimum free energy) ΔG° value predicted by the RNAfold software with a window of 120 nt and a step of one nt, $\overline{MFE_{random}}$ represents the MFE ΔG° value generated from the randomly shuffled sequences with identical nucleotide composition, and $\sigma$ is the standard deviation across all the MFE values. To depict the ΔG° z-score for each nucleotide in the SARS-CoV-2 genome, we utilized the following formula:

\[ z_i = \frac{1}{n} \sum_{m=1}^{n} \Delta G^{\circ}\ z\text{-score}_m \]

where $z_i$ is the average ΔG° z-score for nucleotide $i$, $n$ denotes the total number of sliding windows covering nucleotide $i$, and $\Delta G^{\circ}\ z\text{-score}_m$ indicates the ΔG° z-score of the $m$th window. For example, for nucleotide 1000 under the setting of a 120 nt window length and a one nt step, there are 120 sliding windows covering the nucleotide. So $z_{1000}$, the average ΔG° z-score for nucleotide 1000, is calculated as the sum of the ΔG° z-scores of the 120 sliding windows divided by the total number of sliding windows.

This work was supported by the National Natural Science Foundation of China (61972084).
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges
Origin and evolution of pathogenic coronaviruses
Epidemiology and cause of severe acute respiratory syndrome (SARS) in Guangdong, People's Republic of China
Severe acute respiratory syndrome
Coronaviruses - drug discovery and therapeutic options
Coronaviridae Study Group of the International Committee on Taxonomy of Viruses, The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
Clinical Characteristics of Coronavirus Disease 2019 in China
The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak
Coronavirus Disease 2019 (COVID-19): A Perspective from China
Virology, Epidemiology, Pathogenesis, and Control of COVID-19
CRISPR-Cas12-based detection of SARS-CoV-2
SARS-CoV-2: an Emerging Coronavirus that Causes a Global Threat
The origin, transmission and clinical therapies on coronavirus disease 2019 (COVID-19) outbreak - an update on the status
The Architecture of SARS-CoV
Emerging coronaviruses: Genome structure, replication, and pathogenesis
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
COVID-19 infection: Origin, transmission, and characteristics of human coronaviruses
Probable Pangolin Origin of SARS-CoV-2 Associated with the COVID-19 Outbreak
Identifying SARS-CoV-2 related coronaviruses in Malayan pangolins
The proximal origin of SARS-CoV-2
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Isolation of SARS-CoV-2-related coronavirus from Malayan pangolins
DNA secondary structures: stability and function of G-quadruplex structures
G-Quadruplexes: Prediction, Characterization, and Biological Application
The regulation and functions of DNA and RNA G-quadruplexes
The Structure and Function of DNA G-Quadruplexes
Involvement of G-quadruplex regions in mammalian replication origin activity
G-Quadruplexes in DNA Replication: A Problem or a Necessity?
Human Origin Recognition Complex Binds Preferentially to G-quadruplex-preferable RNA and Single-stranded DNA
G4 motifs affect origin positioning and efficiency in two vertebrate replicators
Telomere DNA G-quadruplex folding within actively extending human telomerase
G-quadruplex formation at the 3' end of telomere DNA inhibits its extension by telomerase, polymerase and unwinding by helicase
Telomeric G-quadruplexes are a substrate and site of localization for human telomerase
Regulation of Telomere Length by G-Quadruplex Telomere DNA- and TERRA-Binding Protein TLS/FUS
G-quadruplex preferentially forms at the very 3′ end of vertebrate telomeric DNA
An RNA G-quadruplex in the 5′ UTR of the NRAS proto-oncogene modulates translation
UTR of the BAG-1 mRNA affects both its cap-dependent and cap-independent translation through global secondary structure maintenance
A G-quadruplex structure within the 5′-UTR of TRF2 mRNA represses translation in human cells
RNA G-quadruplexes at upstream open reading frames cause DHX36- and DHX9-dependent translation of human mRNAs
More Than Just a Kink in Microbial Genomes
G-quadruplexes in viruses: function and potential therapeutic applications
G-quadruplexes and G-quadruplex ligands: targets and tools in antiviral therapy
The SARS-unique domain (SUD) of SARS coronavirus contains two macrodomains that bind G-quadruplexes
LTR Reveals a (3 + 1) Folding Topology Containing a Stem-Loop
U3 Region in the HIV-1 Genome Adopts a G-Quadruplex Structure in Its RNA and DNA Sequence
HIV-1 Nucleocapsid Protein Unfolds Stable RNA G-Quadruplexes in the Viral Genome and Is Inhibited by G-Quadruplex Ligands
Anti-HIV-1 activity of the G-quadruplex ligand BRACO-19
Zika Virus Genomic RNA Possesses Conserved G-Quadruplexes Characteristic of the Flaviviridae Family
The effect of single nucleotide polymorphisms in G-rich regions of high-risk human papillomaviruses on structural diversity of DNA
Chemical Targeting of a G-Quadruplex RNA in the Ebola Virus L Gene
New scoring system to identify RNA G-quadruplex folding
An