Complete mitochondrial genomes and phylogenetic relationships of the genera Nephila and Trichonephila (Araneae, Araneoidea)

Spiders of the genera Nephila and Trichonephila are large orb-weaving spiders. In view of the lack of study on the mitogenome of these genera, and the conflicting systematic status, we sequenced (by next generation sequencing) and annotated the complete mitogenomes of N. pilipes, T. antipodiana and T. vitiana (previously N. vitiana) to determine their features and phylogenetic relationship. Most of the tRNAs have aberrant clover-leaf secondary structure. Based on 13 protein-coding genes (PCGs) and 15 mitochondrial genes (13 PCGs and two rRNA genes), Nephila and Trichonephila form a clade distinctly separated from the other araneid subfamilies/genera. T. antipodiana forms a lineage with T. vitiana in the subclade containing also T. clavata, while N. pilipes forms a sister clade to Trichonephila. The taxon vitiana is therefore a member of the genus Trichonephila and not Nephila as currently recognized. Studies on the mitogenomes of other Nephila and Trichonephila species and related taxa are needed to provide a potentially more robust phylogeny and systematics.

this study (Table S2; Fig. S1). All three mitogenomes sequenced here (N. pilipes, T. antipodiana and T. vitiana) have 13 PCGs, two rRNA genes, 22 tRNAs, a non-coding A + T rich control region, and a large number of intergenic sequences (spacers and overlaps) (Table 1; Table S2; Fig. 1). All three are also AT-rich (Table 2). These mitogenomes have negative AT skewness and positive GC skewness values, indicating a bias toward Ts over As and toward Gs over Cs. Although the whole mitogenomes show an overall negative AT skewness and positive GC skewness, the values vary for individual genes in different mitogenomes (Table 2). The A + T content for the N strand in the Nephila and Trichonephila mitogenomes is slightly higher than that for the J strand, with a negative AT skewness value for the J strand and a positive value for the N strand (Table 2). The GC skewness value is positive for both the J and N strands, with the values for the J strand higher than those for the N strand.
Protein-coding genes and codon usage. The A + T content for PCGs ranges from 69.7% for cox3 to 82.0% for atp8 in N. pilipes, 71.3% for cox1 to 83.4% for atp8 in T. antipodiana, 71.7% for cox3 to 81.4% for atp8 in T. vitiana, and 71.3% for cox3 to 83.4% for atp8 in T. clavata (Table S3) (Table 1; Table S2). Two complete stop codons (TAA and TAG) are present in the Nephila and Trichonephila mitogenomes. In addition, T. clavata has a truncated incomplete T stop codon. ATT is the commonest start codon in N. pilipes  Table 1).
In the present study, a truncated, incomplete stop codon (T) is detected only for cox3 in T. clavata (Table 1; Table S2). No incomplete stop codon has been reported for L. crotalus 20 . Truncated stop codons are, however, not uncommon among animals. Examples of spider mitogenomes with incomplete T stop codons are: E. tricuspidata 19 ; Tet. maxillosa and Tet. nitens 16 21 . Incomplete stop codons are presumed to be completed by post-transcriptional polyadenylation 32 .
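The distinction between complete and truncated stop codons can be illustrated with a minimal Python sketch. It assumes each PCG is supplied as an in-frame sequence beginning at its start codon; the function names `classify_stop` and `completed_stop` are hypothetical and are not part of the annotation pipeline used in this study.

```python
COMPLETE_STOPS = ("TAA", "TAG")

def classify_stop(cds: str) -> str:
    """Classify the 3' end of an in-frame coding sequence.

    Returns 'TAA' or 'TAG' for a complete stop codon, 'incomplete T' or
    'incomplete TA' for a truncated one, and 'none' otherwise.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        last = cds[-3:]
        return last if last in COMPLETE_STOPS else "none"
    tail = cds[-remainder:]  # the one or two bases left after the last full codon
    if tail in ("T", "TA"):
        return f"incomplete {tail}"
    return "none"

def completed_stop(tail: str) -> str:
    """Show how a poly-A tail added to the transcript turns T or TA into TAA."""
    return (tail.upper() + "AA")[:3]

# Toy sequences, not taken from the mitogenomes in this study.
print(classify_stop("ATGAAATAA"))
print(classify_stop("ATGAAAT"))
print(completed_stop("T"))
```

A gene whose length is not a multiple of three and whose trailing bases are T or TA is reported as carrying an incomplete stop codon; any other leftover bases would suggest that the annotated boundaries need checking.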
The frequency of individual amino acids varies among the congeners of Trichonephila as well as between the genera Nephila and Trichonephila (Fig. 2). However, the most frequently utilized codons are highly similar in these mitogenomes. The predominant amino acids (with frequency above 200) in all four mitogenomes are isoleucine (Ile), leucine2 (Leu2), methionine (Met), phenylalanine (Phe), serine2 (Ser2), and valine (Val) (Table S4).
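As an illustration of how codon usage can be summarized, the following Python sketch computes relative synonymous codon usage (RSCU) for a coding sequence under the invertebrate mitochondrial genetic code (NCBI translation table 5). It is a simplified stand-in, not the MEGA X procedure used in this study: codons are grouped by amino acid, so the two leucine and two serine codon families (Leu1/Leu2, Ser1/Ser2) are merged, and the function name `rscu` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial), codons in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def rscu(cds: str) -> dict:
    """Return RSCU per codon: observed count divided by the mean count
    of all synonymous codons for the same amino acid."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values

# Toy sequence (ATT ATT ATC ATA), not from the mitogenomes in this study.
usage = rscu("ATTATTATCATA")
print(round(usage["ATT"], 3), round(usage["ATC"], 3), usage["ATA"])
```

An RSCU value above 1 means a codon is used more often than expected if all synonymous codons were used equally; a value below 1 means it is used less often.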
The cox1 gene has the lowest Ka/Ks ratio in spider mitogenomes, representing fewer changes in amino acids; this supports its use as a molecular marker for species differentiation and DNA barcoding 33,34 .
Ribosomal RNA genes. Of the two rRNA genes in Nephila and Trichonephila mitogenomes, rrnS is much shorter, ranging from 693 bp in N. pilipes to 702 bp in T. antipodiana, while rrnL ranges from 1042 bp in T. antipodiana to 1050 bp in T. vitiana (Table 1, Table S2). As in other araneid spiders, rrnL is located between trnL1 and trnV and rrnS between trnV and trnQ ( Fig. 1; Fig. S1).
Both the rRNA genes of the complete mitogenome are AT-rich (Table 2).
Transfer RNA genes. Most of the tRNAs in Nephila and Trichonephila mitogenomes have aberrant clover-leaf secondary structure, including truncated aminoacyl acceptor stem and mismatched (lacking well-paired) aminoacyl acceptor stem (Fig. 4).
Sixteen tRNAs in the Nephila and Trichonephila mitogenomes do not possess a TΨC arm: seven in N. pilipes and 10 each in T. antipodiana, T. vitiana and T. clavata (Fig. 4). There are also tRNAs with complete loss of TΨC stem (trnD in N. pilipes; trnV in T. antipodiana; and trnK in T. clavata) and complete loss of TΨC loop (trnR and trnQ in N. pilipes and trnK in T. vitiana).
Two tRNAs (trnA, trnS2) lack a DHU arm in all the Nephila and Trichonephila mitogenomes. Other tRNAs without a DHU arm are: trnR in N. pilipes; and trnS1 and trnT in T. clavata. The complete loss of the DHU loop involves trnQ in N. pilipes, trnN and trnV in T. antipodiana and T. clavata, and trnV in T. vitiana (Fig. 4).
Many tRNAs in spider mitogenomes have been reported to lack a well-paired aminoacyl acceptor stem, a TΨC arm, and a DHU arm 35 . None of the 22 tRNA sequences in the H. oregonensis mitogenome has the potential to form a fully paired, seven-member aminoacyl acceptor stem 24 . A mismatched aminoacyl acceptor stem has been reported to be a shared characteristic among spider mitogenomes 35 . It has been postulated that the missing 3ʹ acceptor stem sequence is restored post-transcriptionally by an RNA-editing mechanism 24 . In the A. aquatica mitogenome, the tRNAs are characterized by a mismatched aminoacyl acceptor stem and, except for trnS1 and trnS2 (both with only a TΨC loop), lack a TΨC arm 25 . The armless tRNA secondary structures are conserved across the family Dysderidae 36 .
and T. vitiana (511 bp) is much shorter than that of T. clavata (848 bp) (Table 1; Table S2). Spider mitogenomes with less than 800 bp for the control region include: N. nautica (455 bp) and N. doenitzi (566 bp) 31 ; E. coreana   16 ; and A. aquatica (2047 bp) 25 . The control region of Nephila and Trichonephila mitogenomes is AT-rich (Table 2), with a negative AT skewness value in T. antipodiana and positive values in N. pilipes, T. vitiana and T. clavata (Table S3). The GC skewness value is positive for all four mitogenomes.
The control region of Nephila and Trichonephila mitogenomes is characterized by: (i) many simple tandem repeats and palindromes; (ii) long poly-nucleotide stretches; and (iii) several stem-loop structures. The presence of 15 tandem repeats of the ATAGA motif with a TAT ATA CATAT stretch (except one each with TAT, TAT GTA CATAT, and TAT ATA CATAA) in T. clavata (Fig. 5) is a unique feature for this orb-weaving spider. Five 135-bp tandem repeats and two 363-bp tandem repeats have been identified in the putative control region of A. aquatica 25 . A long tandem repeat region comprising three full 215-bp copies and a partial 87-bp copy is present in the control region of the W. fidelis mitogenome 18 .
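A simple way to locate runs of an exactly repeated motif in a control-region sequence is sketched below in Python. This is only an exact-match illustration using a regular expression and does not reproduce Tandem Repeats Finder, which also detects imperfect repeats; the function name `tandem_runs` and the example sequence are hypothetical.

```python
import re

def tandem_runs(seq: str, motif: str, min_copies: int = 2) -> list:
    """Return (start, copies) for each run of `motif` repeated back to back
    at least `min_copies` times; `start` is 0-based."""
    motif = motif.upper()
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
    return [
        (m.start(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(seq.upper())
    ]

# Toy sequence: three adjacent ATAGA copies, then an isolated copy.
print(tandem_runs("GGATAGAATAGAATAGACCATAGA", "ATAGA"))
```

Because matching is exact, repeat units that differ by a single substitution split a run into separate hits, which is one reason dedicated repeat finders are preferred for real data.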
Phylogenetic analysis. An early study based on one nuclear (18S) and two mitochondrial (COXI and 16S) markers revealed that N. pilipes and N. constricta Karsch, 1879 formed a clade that was sister to all other Nephila species 37 . This finding was supported by a molecular phylogenetic study based on three nuclear and five mitochondrial genes, which indicated that the genus Nephila was diphyletic, with true Nephila (containing N. pilipes and N. constricta) and the other species (now genus Trichonephila according to Kuntner et al. 1 ) being sister to the genus Clitaetra Simon, 1889 38 .
The present phylogenetic trees based on 13 PCGs and 15 mt-genes (13 PCGs and 2 rRNA genes) reveal identical topology with very good nodal support based on ML and BI methods (Fig. 6; Fig. S2). The genera Nephila and Trichonephila form a clade distinct from other genera of Araneidae. T. antipodiana and T. vitiana are more closely related to each other within the lineage that also contains T. clavata, while N. pilipes is distinctly separated from these Trichonephila species. The araneid subfamilies Araneinae (genera Araneus, Cyclosa, Hypsosinga and Neoscona), Argiopinae (genus Argiope), Cyrtarachninae (genus Cyrtarachna) and Cyrtophorinae (genus Cyrtophora) form a clade distinct from the Nephila-Trichonephila clade.
Araneinae does not form a monophyletic group, with the genus Cyclosa being basal to the other Araneinae genera (Araneus, Hypsosinga and Neoscona), as well as the monophyletic subfamilies Argiopinae and Cyrtophorinae (Fig. 6; Fig. S2). Argiopinae and Cyrtophorinae form a lineage distinct from the Araneinae lineages comprising Neoscona and (Araneus-Hypsosinga). Cyrtarachninae is basal to the above araneid subfamilies. A large, representative taxonomic sampling is needed to reconstruct a robust phylogeny.
Both the BI and ML trees based on the two rRNA (rrnL and rrnS) sequences recover the same clades as the trees based on 15 mt-genes and 13 PCGs (Fig. 6; Fig. S2). However, the genera Araneus and Argiope do not form monophyletic lineages, and the genus Cyclosa is the most basal genus to the other araneid genera. This result indicates that the rRNA genes alone are not suitable for reconstructing phylogeny at the higher taxonomic level.
In a recent study based on 13 protein-coding genes of the complete mitogenome, Nephilidae (represented by T. clavata) is basal to the family Araneidae 19 . Our present study, with the inclusion of N. pilipes, T. antipodiana and T. vitiana (previously N. vitiana) as well as T. clavata and additional recently published mitogenomes of Araneidae, supports the Nephila-Trichonephila clade being basal to other araneid subfamilies (Fig. 6; Fig. S2). The close affinity of T. vitiana with T. antipodiana and T. clavata indicates that it is a member of the genus Trichonephila and not Nephila as currently recognized 2 .
The close affinity between T. antipodiana and T. vitiana is also reflected by their genetic distance: 8.65% based on 13 PCGs and 8.62% based on 15 mt-genes. On the other hand, the genetic distance between T. vitiana and N. pilipes is 21.68% based on 13 PCGs and 21.56% based on 15 mt-genes. Based on 15 mt-genes, the genetic distance between Trichonephila species ranges from 8.62 to 13.41% (Table S6).
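The distance model behind these percentages is not stated in this section. As a hedged illustration only, the Python sketch below computes an uncorrected pairwise distance (p-distance) between two aligned sequences, which is an assumption and may differ from the measure actually used; the function name `p_distance` is hypothetical.

```python
def p_distance(seq1: str, seq2: str) -> float:
    """Percentage of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differences / compared

# Toy alignment: 9 comparable sites, 1 difference.
print(round(p_distance("ACGTACGT-A", "ACGAACGTTA"), 2))
```

For concatenated data sets such as the 13 PCGs or 15 mt-genes, the same calculation would be applied to the concatenated alignment of each species pair.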
Mitochondrial genomes have been applied particularly to studies regarding phylogeny and evolution of insects 40 . A recent study on spider mitogenomes covered only 12 species of Araneidae: 1 species of Trichonephila,
Analysis of mitogenome. Raw sequence reads were obtained from the MiSeq system in FASTQ format.
The overall quality of the sequences was assessed from their Phred scores using FastQC software 43 . Ambiguous nucleotides and raw sequence reads with Phred scores lower than Q20 were trimmed and removed using CLC Genomics Workbench v.7.0.4 (Qiagen, Germany). Quality-filtered DNA sequences were mapped against the reference mitogenome of T. clavata (NC_008063), before a de novo assembly was performed on the mapped DNA sequences. Contigs larger than 13 kbp were extracted for a BLAST search against the NCBI nucleotide database to identify the mitochondrial genome of the spider species 41 . In parallel, demultiplexed raw sequence reads that were free of sequencing adapters were subjected to de novo assembly using NOVOPlasty with different k-mer lengths 44 . The assembled genomes from both programs were aligned and examined for terminal repeats to evaluate their circularity and completeness. The mitogenome sequences of N. pilipes, T. antipodiana and T. vitiana (previously N. vitiana) have been deposited in GenBank under the accession numbers MW178204, MW178205 and MW178206, respectively.
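To make the Q20 threshold concrete, the Python sketch below decodes Phred scores from a FASTQ quality string and trims low-quality bases from the 3ʹ end of a read. It assumes the Phred+33 encoding and a simple end-trimming rule; it is not the trimming algorithm implemented in CLC Genomics Workbench, and the function names are hypothetical.

```python
def phred_scores(quality: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]

def error_probability(q: int) -> float:
    """Base-call error probability for Phred score q: 10 ** (-q / 10)."""
    return 10 ** (-q / 10)

def trim_read(seq: str, quality: str, min_q: int = 20):
    """Trim bases below min_q from the 3' end.

    Returns (sequence, quality) or None if nothing remains or the trimmed
    read still contains an ambiguous base (N).
    """
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    trimmed = seq[:end]
    if not trimmed or "N" in trimmed.upper():
        return None
    return trimmed, quality[:end]

# Toy read: 'I' encodes Q40, '5' encodes Q20 and '#' encodes Q2.
print(trim_read("ACGTAC", "IIII5#"))
print(error_probability(20))
```

A Phred score of 20 corresponds to an expected error rate of 1 in 100 base calls, so the threshold retains bases called with at least 99% accuracy.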
Gene annotation, visualization and comparative analysis. The assembled mitogenomes were submitted to the MITOS web-server (http://mitos.bioinf.uni-leipzig.de/index.py) for an initial gene annotation 45 . The coding regions of protein coding genes (PCGs), transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) were further validated using nucleotide-nucleotide BLAST (BLASTn) and protein-protein BLAST (BLASTp) 46 against the reference mitogenome of T. clavata (NC_008063). For tRNA genes that were not identified, we extracted the DNA sequences of their putative coding regions for an additional Infernal prediction with maximum overlap increased to 50 26 . The gene boundaries as well as the start and stop codons of PCGs were determined following multiple sequence alignment using ClustalW 47 . The overlapping and intergenic spacer regions were curated manually 21 . The nucleotide composition, amino acid frequency and relative synonymous codon usage (RSCU) in the complete mitogenomes were calculated in MEGA X 48 . The ratios of non-synonymous substitutions (Ka) and synonymous (Ks) substitutions for all PCGs were estimated in DnaSP6.0 49 . The skewness of the mitogenomes was determined from the formulae: AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) 50 . Inverted repeats or palindromes in the control region were checked using Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html) 51 . The circular mitogenomes of the spiders were visualized using Blast Ring Image Generator (BRIG) 52 .
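The skewness formulae given above can be applied to any sequence with a few lines of Python. The sketch below is a minimal illustration, not the MEGA X workflow used in this study; the function name `composition_stats` is hypothetical, and ambiguous bases are simply ignored, which is an assumption.

```python
from collections import Counter

def composition_stats(seq: str) -> dict:
    """Return A+T content (%), AT skew and GC skew for a DNA sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are not counted
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")
    return {
        "at_percent": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
    }

# Toy sequence (A=1, T=4, G=2, C=1), not from the mitogenomes in this study.
stats = composition_stats("ATTTTGGC")
print(stats["at_percent"], stats["at_skew"], round(stats["gc_skew"], 3))
```

A negative AT skew indicates more T than A on the strand analysed, and a positive GC skew indicates more G than C; the same function can be run on a whole mitogenome, a single gene, or the control region.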