key: cord-0915350-u9ydszmf
authors: Yong, Hoi-Sen; Song, Sze-Looi; Chua, Kah-Ooi; Wayan Suana, I.; Eamsobhana, Praphathip; Tan, Ji; Lim, Phaik-Eem; Chan, Kok-Gan
title: Complete mitochondrial genomes and phylogenetic relationships of the genera Nephila and Trichonephila (Araneae, Araneoidea)
date: 2021-05-21
journal: Sci Rep
DOI: 10.1038/s41598-021-90162-1
sha: 2e38273c391e14f07220d54bceb5402df3cd989c
doc_id: 915350
cord_uid: u9ydszmf

Spiders of the genera Nephila and Trichonephila are large orb-weaving spiders. In view of the lack of study on the mitogenome of these genera, and the conflicting systematic status, we sequenced (by next-generation sequencing) and annotated the complete mitogenomes of N. pilipes, T. antipodiana and T. vitiana (previously N. vitiana) to determine their features and phylogenetic relationship. Most of the tRNAs have an aberrant clover-leaf secondary structure. Based on 13 protein-coding genes (PCGs) and 15 mitochondrial genes (13 PCGs and two rRNA genes), Nephila and Trichonephila form a clade distinctly separated from the other araneid subfamilies/genera. T. antipodiana forms a lineage with T. vitiana in the subclade that also contains T. clavata, while N. pilipes forms a sister clade to Trichonephila. The taxon vitiana is therefore a member of the genus Trichonephila and not Nephila as currently recognized. Studies on the mitogenomes of other Nephila and Trichonephila species and related taxa are needed to provide a potentially more robust phylogeny and systematics.

... this study (Table S2; Fig. S1). All three present mitogenomes (N. pilipes, T. antipodiana and T. vitiana) have 13 PCGs, two rRNA genes, 22 tRNAs, a non-coding A + T-rich control region, and a large number of intergenic sequences (spacers and overlaps) (Table 1; Table S2; Fig. 1). In addition, all three mitogenomes (N. pilipes, T. antipodiana and T. vitiana) are AT-rich (Table 2).
These mitogenomes have negative AT skewness and positive GC skewness values, indicating a bias toward T over A and toward G over C. Although an overall negative AT skewness value and positive GC skewness value are observed for the whole mitogenomes, the values vary for individual genes in different mitogenomes (Table 2). The A + T content for the N strand in the Nephila and Trichonephila mitogenomes is slightly higher than that for the J strand, with a negative AT skewness value for the J strand and a positive value for the N strand (Table 2). The GC skewness value is positive for both the J and N strands, with the values for the J strand higher than those for the N strand.

The mitogenomes of both Nephila and Trichonephila are characterized by many more intergenic overlaps than spacers (Table 1; Table S2). The longest spacer in N. pilipes (19 bp) is between trnL1 and rrnL as well as between rrnL and trnV; that in T. antipodiana (24 bp) is between rrnL and trnV; that in T. vitiana (32 bp) is between rrnL and trnV; and that in T. clavata (48 bp) is between cox1 and cox2. The respective largest overlaps were: −29 bp ... A larger number of intergenic overlaps than spacers is also evident in the mitogenomes of other spiders: Tetragnatha maxillosa and Tet. nitens (Tetragnathidae) 16; Epeus alboguttatus (Salticidae) 17; Wadicosa fidelis (Lycosidae) 18; Ebrechtella tricuspidata (Thomisidae) 19; Lyrognathus crotalus (Theraphosidae) 20; and Cheiracanthium triviale (Cheiracanthiidae) and Dysdera silvatica (Dysderidae) 21.

Protein-coding genes and codon usage. The A + T content for PCGs ranges from 69.7% for cox3 to 82.0% for atp8 in N. pilipes, 71.3% for cox1 to 83.4% for atp8 in T. antipodiana, 71.7% for cox3 to 81.4% for atp8 in T. vitiana, and 71.3% for cox3 to 83.4% for atp8 in T. clavata (Table S3). ... (Table 1; Table S2). Two complete stop codons (TAA and TAG) are present in the Nephila and Trichonephila mitogenomes. In addition, T.
clavata has a truncated, incomplete T stop codon. ATT is the commonest start codon in N. pilipes ... (Table 1). Other spiders with ATA as the most common start codon include Tet. maxillosa (5 PCGs) and Tet. nitens (5 PCGs) 16; D. silvatica (6 PCGs) 21; E. alboguttatus (5 PCGs) 17; W. fidelis (5 PCGs) 18; and E. tricuspidata (7 PCGs) 19. Spiders with ATT as the most common start codon include C. triviale (5 PCGs) 21; L. crotalus (6 PCGs) 20; Araneus ventricosus (Araneidae) (7 PCGs) 22; Argiope ocula (Araneidae) (4 PCGs) 23; Habronattus oregonensis (Salticidae) (6 PCGs) 24; and Argyroneta aquatica (Cybaeidae) (6 PCGs) 25. Among six species of Dysderidae spiders, ATA is the commonest start codon in only one species (Parachtes teruelis); the other species have ATT as the commonest start codon 26.

TAA is the commonest stop codon in N. pilipes (9 PCGs), T. antipodiana (10 PCGs), T. vitiana (11 PCGs), and T. clavata (9 PCGs), the exceptions being TAG for cob, nad1 and nad5 in N. pilipes; nad1, nad2 and cob in T. antipodiana; cob and nad1 in T. vitiana; and nad2, nad6, cob and nad1 in T. clavata (Table 1; Table S2). TAA has been reported to be the most common stop codon in A. ventricosus (9 PCGs) 24, Neoscona scylla (Araneidae) (12 PCGs) 27, Tet. maxillosa (8 PCGs) and Tet. nitens (10 PCGs) 16, E. alboguttatus (8 PCGs) 17, Evarcha coreana (Salticidae) (9 PCGs) 28, W. fidelis (7 PCGs) 18, E. tricuspidata (5 PCGs) 19, Uroctea compactilis (Oecobiidae) (6 PCGs) 29, C. triviale (7 PCGs) and D. silvatica (7 PCGs) 21, L. crotalus (8 PCGs) 20, H. oregonensis (5 PCGs) 24, A. aquatica (4 PCGs and 6 truncated T) 25, and Mesabolivar sp. 1 (Pholcidae) (8 PCGs) and Mesabolivar sp. 2 (11 PCGs) 30. In the present study, a truncated, incomplete stop codon (T) is detected only for cox3 in T. clavata (Table 1; Table S2). No incomplete stop codon has been reported for L. crotalus 20. Truncated stop codons are, however, not uncommon in the animal world.
Examples of spider mitogenomes with incomplete T stop codons are: E. tricuspidata 19; Tet. maxillosa and Tet. nitens 16; ... 21. Incomplete stop codons are presumed to be completed by post-transcriptional polyadenylation 32.

The frequency of individual amino acids varies among the congeners of Trichonephila as well as between the genera Nephila and Trichonephila (Fig. 2). However, the most frequently utilized codons are highly similar in these mitogenomes. The predominant amino acids (with frequency above 200) in all four mitogenomes are isoleucine (Ile), leucine2 (Leu2), methionine (Met), phenylalanine (Phe), serine2 (Ser2), and valine (Val) (Table S4). Analysis of the relative synonymous codon usage (RSCU) reveals biased usage of A/T over G/C at the third codon position (Fig. 2). The frequency of each codon is very similar across all four spider mitogenomes.

... (Table S5). A similar finding has been reported for 17 spider mitogenomes 20. The order of Ka/Ks ratios (cox1 < cox2 < cob < cox3 < nad1 < nad4 < atp6 < nad5 < nad4L < nad3 < nad2 < nad6 < atp8) in Nephila and Trichonephila species differs from the order (cox1 < nad1 < cox2 < nad5 < cob < cox3 < nad4 < atp6 < nad4L < nad3 < nad2 < nad6 < atp8) reported for 17 spider mitogenomes 20. The cox1 gene has the lowest Ka/Ks ratio in spider mitogenomes, representing fewer changes in amino acids, which supports its use as a molecular marker for species differentiation and DNA barcoding 33, 34.

Ribosomal RNA genes. Of the two rRNA genes in Nephila and Trichonephila mitogenomes, rrnS is much shorter, ranging from 693 bp in N. pilipes to 702 bp in T. antipodiana, while rrnL ranges from 1042 bp in T. antipodiana to 1050 bp in T. vitiana (Table 1; Table S2). As in other araneid spiders, rrnL is located between trnL1 and trnV and rrnS between trnV and trnQ (Fig. 1; Fig. S1).
Both rRNA genes of the complete mitogenome are AT-rich (Table 2).

Most of the tRNAs in Nephila and Trichonephila mitogenomes have an aberrant clover-leaf secondary structure, including a truncated aminoacyl acceptor stem and a mismatched (lacking well-paired) aminoacyl acceptor stem (Fig. 4). Sixteen tRNAs in the Nephila and Trichonephila mitogenomes do not possess a TΨC arm: seven in N. pilipes and 10 each in T. antipodiana, T. vitiana and T. clavata (Fig. 4). There are also tRNAs with complete loss of the TΨC stem (trnD in N. pilipes; trnV in T. antipodiana; and trnK in T. clavata) and complete loss of the TΨC loop (trnR and trnQ in N. pilipes and trnK in T. vitiana). Two tRNAs (trnA, trnS2) lack the DHU arm in all the Nephila and Trichonephila mitogenomes. Other tRNAs without a DHU arm are trnR in N. pilipes, and trnS1 and trnT in T. clavata. The complete loss of the DHU loop involves trnQ in N. pilipes, trnN and trnV in T. antipodiana and T. clavata, and trnV in T. vitiana (Fig. 4).

Many tRNAs in spider mitogenomes have been reported to lack a well-paired aminoacyl acceptor stem, a TΨC arm, and a DHU arm 35. None of the 22 tRNA sequences in the H. oregonensis mitogenome has the potential to form a fully paired, seven-member aminoacyl acceptor stem 24. A mismatched aminoacyl acceptor stem has been reported to be a shared characteristic among spider mitogenomes 35. It has been postulated that the missing 3ʹ acceptor stem sequence is post-transcriptionally modified by an RNA-editing mechanism 24. In the A. aquatica mitogenome, the tRNAs are characterized by a mismatched aminoacyl acceptor stem and, except for trnS1 and trnS2 (both with only a TΨC loop), the remaining tRNAs lack a TΨC arm 25. The armless tRNA secondary structures are conserved across the family Dysderidae 36.

... and T. vitiana (511 bp) is much shorter than that of T. clavata (848 bp) (Table 1; Table S2). Spider mitogenomes with less than 800 bp for the control region include: N. nautica (455 bp) and N.
doenitzi (566 bp) 31; E. coreana 16; and A. aquatica (2047 bp) 25. The control region of Nephila and Trichonephila mitogenomes is AT-rich (Table 2), with a negative AT skewness value in T. antipodiana and positive values in N. pilipes, T. vitiana and T. clavata (Table S3). The GC skewness value is positive for all four mitogenomes. The control region of these spider mitogenomes is characterized by: (i) many simple tandem repeats and palindromes; (ii) long poly-nucleotide stretches; and (iii) several stem-loop structures. The presence of 15 tandem repeats of the ATAGA motif with a TAT ATA CATAT stretch (except one each with TAT, TAT GTA CATAT, and TAT ATA CATAA) in T. clavata (Fig. 5) is a unique feature of this orb-weaving spider. Five 135-bp tandem repeats and two 363-bp tandem repeats have been identified in the putative control region of A. aquatica 25. A long tandem repeat region comprising three full 215-bp repeats and a partial 87-bp repeat is present in the control region of the W. fidelis mitogenome 18.

Phylogenetic analysis. An early study based on one nuclear (18S) and two mitochondrial (COXI and 16S) markers revealed that N. pilipes and N. constricta Karsch, 1879 formed a clade that was sister to all other Nephila species 37. This finding was supported by a molecular phylogenetic study based on three nuclear and five mitochondrial genes, which indicated that the genus Nephila was diphyletic, with true Nephila (containing N. pilipes and N. constricta) and the other species (now genus Trichonephila according to Kuntner et al. 1) being sister to the genus Clitaetra Simon, 1889 38.

The present phylogenetic trees based on 13 PCGs and 15 mt-genes (13 PCGs and 2 rRNA genes) reveal identical topology with very good nodal support based on ML and BI methods (Fig. 6; Fig. S2). The genera Nephila and Trichonephila form a clade distinct from other genera of Araneidae. T. antipodiana and T.
vitiana are more closely related within the lineage that also contains T. clavata, while N. pilipes is distinctly separated from these Trichonephila species. The araneid subfamilies Araneinae (genera Araneus, Cyclosa, Hypsosinga and Neoscona), Argiopinae (genus Argiope), Cyrtarachninae (genus Cyrtarachna) and Cyrtophorinae (genus Cyrtophora) form a clade distinct from the Nephila-Trichonephila clade. Araneinae does not form a monophyletic group, with the genus Cyclosa being basal to the other Araneinae genera (Araneus, Hypsosinga and Neoscona), as well as to the monophyletic subfamilies Argiopinae and Cyrtophorinae (Fig. 6; Fig. S2). Argiopinae and Cyrtophorinae form a lineage distinct from the Araneinae lineages comprising Neoscona and (Araneus-Hypsosinga); Cyrtarachninae is basal to the above araneid subfamilies. A large, representative taxonomic sampling is needed to reconstruct a robust phylogeny.

Both the BI and ML trees based on two rRNA (rrnL and rrnS) sequences reveal the same clades as those based on 15 mt-genes and 13 PCGs (Fig. 6; Fig. S2). However, the genera Araneus and Argiope do not form monophyletic lineages, and the genus Cyclosa is the most basal genus to the other araneid genera. This result indicates that the rRNA genes alone are not suitable for reconstructing phylogeny at the higher taxonomic level.

In a recent study based on 13 protein-coding genes of the complete mitogenome, Nephilidae (represented by T. clavata) is basal to the family Araneidae 19. Our present study, with the inclusion of N. pilipes, T. antipodiana and T. vitiana (previously N. vitiana) as well as T. clavata and additional recently published mitogenomes of Araneidae, supports the Nephila-Trichonephila clade being basal to other araneid subfamilies (Fig. 6; Fig. S2). The close affinity of T. vitiana with T. antipodiana and T. clavata indicates that it is a member of the genus Trichonephila and not Nephila as currently recognized 2. The close affinity between T. antipodiana and T.
vitiana is also reflected by their genetic distance: 8.65% based on 13 PCGs and 8.62% based on 15 mt-genes. In contrast, the genetic distance between T. vitiana and N. pilipes is 21.68% based on 13 PCGs and 21.56% based on 15 mt-genes. Based on 15 mt-genes, the genetic distance between Trichonephila species ranges from 8.62 to 13.41% (Table S6). Studies based on morphological data and mitochondrial and nuclear gene sequences have indicated a closer relationship of T. antipodiana with T. clavata than with N. pilipes 37-39. Based on an anchored hybrid enrichment (AHE) targeted-sequencing approach with 585 single-copy orthologous loci, the genus Nephila is basal to the genera Herennia Thorell, 1877, Nephilengys L. Koch, 1872, Nephilingis Kuntner, 2013, Trichonephila and Clitaetra 1. The genus Clitaetra is basal to the genera Herennia, Nephilengys, Nephilingis, and Trichonephila.

Mitochondrial genomes have been applied particularly to studies regarding phylogeny and evolution of insects 40. A recent study on spider mitogenomes covered only 12 species of Araneidae: 1 species of Trichonephila, ...

Analysis of mitogenome. Raw sequence reads were obtained from the MiSeq system in FASTQ format. The overall quality of the sequences was assessed from their Phred scores using FastQC software 43. Ambiguous nucleotides and raw sequence reads with a Phred score lower than Q20 were trimmed and removed using CLC Genomics Workbench v.7.0.4 (Qiagen, Germany). Quality-filtered DNA sequences were mapped against the reference mitogenome of T. clavata (NC_008063), before a de novo assembly was performed on the mapped DNA sequences. Contigs larger than 13 kbp were extracted for a BLAST search against the NCBI nucleotide database to identify the mitochondrial genome of the spider species 41. Separately, demultiplexed raw sequence reads that were free of sequencing adapters were subjected to de novo assembly using NOVOPlasty with different k-mer lengths 44.
The assembled genomes from both programs were aligned and examined for terminal repeats to evaluate their circularity and completeness. The mitogenome sequences of N. pilipes, T. antipodiana and T. vitiana (previously N. vitiana) have been deposited in GenBank under the accession numbers MW178204, MW178205 and MW178206, respectively.

Gene annotation, visualization and comparative analysis. The assembled mitogenomes were submitted to the MITOS web server (http://mitos.bioinf.uni-leipzig.de/index.py) for an initial gene annotation 45. The coding regions of protein-coding genes (PCGs), transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) were further validated using nucleotide-nucleotide BLAST (BLASTn) and protein-protein BLAST (BLASTp) 46 against the reference mitogenome of T. clavata (NC_008063). For tRNA genes that were not identified, we extracted the DNA sequences of their putative coding regions for an additional Infernal prediction with maximum overlap increased to 50 26. The gene boundaries as well as the start and stop codons of PCGs were determined following multiple sequence alignment using ClustalW 47. The overlapping and intergenic spacer regions were curated manually 21. The nucleotide composition, amino acid frequency and relative synonymous codon usage (RSCU) in the complete mitogenomes were calculated in MEGA X 48. The ratios of non-synonymous (Ka) to synonymous (Ks) substitutions for all PCGs were estimated in DnaSP6.0 49. The skewness of the mitogenomes was determined from the formulae: AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) 50. Inverted repeats or palindromes in the control region were checked using Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html) 51. The circular mitogenomes of the spiders were visualized using Blast Ring Image Generator (BRIG) 52.
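As an illustration of the composition statistics described above, the following minimal Python sketch computes the A + T content together with the AT and GC skew from the formulae AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). It is not the code used in this study; the function names and the toy sequence are hypothetical, and ignoring ambiguous bases is an assumption made only for the example.

```python
from collections import Counter


def base_counts(seq: str) -> dict:
    """Count A, C, G and T; case-insensitive, other symbols are ignored (assumption)."""
    counts = Counter(seq.upper())
    return {base: counts[base] for base in "ACGT"}


def skew(x: int, y: int) -> float:
    """Generic skew (x - y) / (x + y); returns 0.0 when both counts are zero."""
    total = x + y
    return (x - y) / total if total else 0.0


def composition_stats(seq: str) -> dict:
    """Return A + T content (%), AT skew and GC skew for one sequence."""
    c = base_counts(seq)
    n = sum(c.values())
    return {
        "at_content_pct": 100.0 * (c["A"] + c["T"]) / n if n else 0.0,
        "at_skew": skew(c["A"], c["T"]),  # (A - T) / (A + T)
        "gc_skew": skew(c["G"], c["C"]),  # (G - C) / (G + C)
    }


if __name__ == "__main__":
    toy = "ATTTTGGC"  # hypothetical toy sequence, not real mitogenome data
    print(composition_stats(toy))
```

For the toy sequence, A = 1, T = 4, G = 2 and C = 1, so AT skew = (1 − 4)/5 = −0.6 and GC skew = (2 − 1)/3 ≈ 0.33. A negative AT skew combined with a positive GC skew is the same sign pattern reported for the whole mitogenomes in Table 2.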
1. Golden orbweavers ignore biological rules: phylogenomic and comparative analyses unravel a complex evolution of sexual size dimorphism
2. World Spider Catalog (2020). Natural History Museum Bern
3. Discovery of the largest orbweaving spider species: the evolution of gigantism in Nephila
4. The systematics and biology of the spider genus Nephila (Araneae: Nephilidae) in the Australasian region
5. Phylogenetic systematics of the Gondwanan nephilid spider lineage Clitaetrinae (Araneae, Nephilidae)
6. Rounding up the usual suspects: a standard target-gene approach for resolving the interfamilial phylogenetic relationships of ecribellate orb-weaving spiders with a new family-rank classification (Araneae, Araneoidea)
7. Monophyly, taxon sampling, and the nature of ranks in the classification of orb-weaving spiders (Araneae: Araneoidea)
8. Taxonomic account of the genus Nephila (Araneae: Nephilidae) of Bangladesh
9. Phylogeny of extant nephilid orb-weaving spiders (Araneae, Nephilidae): testing morphological and ethological homologies
10. Colour variation and polymorphism in the Giant orb-weaving spider Nephila vitiana (Araneae: Nephilidae) from Lombok, Indonesia
11. Abdominal colour polymorphism in female Asian golden web spider Nephila antipodiana (Araneae: Nephilidae)
12. Mitteilungen aus dem Museum für Naturkunde in Berlin. Zoologisches Museum und Institut für Spezielle Zoologie
13. The complete mitochondrial genome of Nephila clavata (Araneae: Nephilidae) Chinese population
14. Complete mitochondrial genome and phylogenetic analysis of Argiope perforata (Araneae: Araneidae)
15. Characterization of the complete mitochondrial genome of Cyclosa japonica (Araneae: Araneidae)
16. The complete mitochondrial genome of two Tetragnatha spiders (Araneae: Tetragnathidae): severe truncation of tRNAs and novel gene rearrangements in Araneae
17. The complete mitochondrial genome of Epeus alboguttatus (Araneae: Salticidae)
18. The complete mitochondrial genome of the wolf spider Wadicosa fidelis (Araneae: Lycosidae)
19. Complete mitochondrial genome of the crab spider Ebrechtella tricuspidata (Araneae: Thomisidae): a novel tRNA rearrangement and phylogenetic implications for Araneae
20. The complete mitochondrial genome of endemic giant tarantula, Lyrognathus crotalus (Araneae: Theraphosidae) and comparative analysis
21. The gene arrangement and phylogeny using mitochondrial genomes in spiders (Arachnida: Araneae)
22. The complete mitochondrial genome of orb-weaving spider Araneus ventricosus (Araneae: Araneidae)
23. The complete mitochondrial genome of Argiope ocula (Araneae: Araneidae) and its phylogeny
24. The complete mitochondrial genome sequence of the spider Habronattus oregonensis reveals rearranged and extremely truncated tRNAs
25. The mitochondrial genome of the water spider Argyroneta aquatica (Araneae: Cybaeidae)
26. Arm-less mitochondrial tRNAs conserved for over 30 millions of years in spiders
27. Characterization of the complete mitochondrial genome sequence of Neoscona scylla and phylogenetic analysis
28. Characterization of complete mitochondrial genome of Evarcha coreana (Araneae: Salticidae)
29. Complete mitogenome and phylogenetic position of Uroctea compactilis (Arachnida: Oecobiidae). Mitochondrial DNA B 4
30. Complete mitochondrial genomes of three troglophile cave spiders (Mesabolivar, Pholcidae)
31. Characterization of the complete mitogenomes of two Neoscona spiders (Araneae: Araneidae) and its phylogenetic implications
32. tRNA punctuation model of RNA processing in human mitochondria
33. Assembling a DNA barcode reference library for the spiders (Arachnida: Araneae) of Pakistan
34. Towards a DNA barcode reference database for spiders and harvestmen of Germany
35. Parallel evolution of truncated transfer RNA genes in arachnid mitochondrial genomes
36. On the shoulder of giants: mitogenome recovery from non-targeted genome projects for phylogenetic inference and molecular evolution studies
37. Biogeography and speciation patterns of the golden orb spider genus Nephila (Araneae: Nephilidae) in Asia
38. A molecular phylogeny of nephilid spiders: evolutionary history of a model lineage
39. A revised phylogenetic analysis for the spider genus Clitaetra Simon, 1889 (Araneae, Araneoidea, Nephilidae) with the first description of the male of the Sri Lankan species Clitaetra thisbe Simon
40. Insect mitochondrial genomics: implications for evolution and phylogeny
41. Complete mitochondrial genome of three Bactrocera fruit flies of subgenus Bactrocera (Diptera: Tephritidae) and their phylogenetic implications
42. Mitochondrial genome supports sibling species of Angiostrongylus costaricensis (Nematoda: Angiostrongylidae)
43. A quality control tool for high throughput sequence data
44. NOVOPlasty: de novo assembly of organelle genomes from whole genome data
45. MITOS: improved de novo metazoan mitochondrial genome annotation
46. Basic local alignment search tool
47. Sequence alignment and homology search with BLAST and ClustalW
48. MEGA X: molecular evolutionary genetics analysis across computing platforms
49. DNA sequence polymorphism analysis of large data sets
50. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes
51. Tandem repeats finder: a program to analyze DNA sequences
52. BLAST ring image generator (BRIG): simple prokaryote genome comparisons
53. MAFFT multiple sequence alignment software version 7: improvements in performance and usability
54. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies
55. ModelFinder: fast model selection for accurate phylogenetic estimates
56. Estimating the dimension of a model
57. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
58. Kakusan: a computer program to automate the selection of a nucleotide substitution model and the configuration of a mixed model on multilocus data
59. MRBAYES: Bayesian inference of phylogenetic trees
60. Tree Figure Drawing Tool

The authors declare no competing interests.

Supplementary Information. The online version contains supplementary material available at https://doi.org/10.1038/s41598-021-90162-1.

Correspondence and requests for materials should be addressed to S.-L.S.

Reprints and permissions information is available at www.nature.com/reprints.

Publisher's note. Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.