Spider silks are the toughest known biological materials, yet are lightweight and virtually invisible to the human immune system, and they thus have revolutionary potential for medicine and industry. Spider silks are largely composed of spidroins, a unique family of structural proteins. To investigate spidroin genes systematically, we constructed the first genome of an orb-weaving spider: the golden orb-weaver (Nephila clavipes), which builds large webs using an extensive repertoire of silks with diverse physical properties. We cataloged 28 Nephila spidroins, representing all known orb-weaver spidroin types, and identified 394 repeated coding motif variants and higher-order repetitive cassette structures unique to specific spidroins. Characterization of spidroin expression in distinct silk gland types indicates that glands can express multiple spidroin types. We find evidence of an alternatively spliced spidroin, a spidroin expressed only in venom glands, evolutionary mechanisms for spidroin diversification, and non-spidroin genes with expression patterns that suggest roles in silk production.
References
- Natural History Museum Bern. The World Spider Catalog, version 18.0 http://wsc.nmbe.ch/ (accessed 9 November 2016).
- Spider phylogenomics: untangling the Spider Tree of Life. PeerJ 4, e1719 (2016).
- Reconstructing web evolution and spider diversification in the molecular era. Proc. Natl. Acad. Sci. USA 106, 5229–5234 (2009).
- Bioprospecting finds the toughest biological material: extraordinary silk from a giant riverine orb spider. PLoS One 5, e11234 (2010).
- Variation in the material properties of spider dragline silk across species. Appl. Phys., A Mater. Sci. Process. 82, 213–218 (2006).
- Toughness of spider silk at high and low temperatures. Adv. Mater. 17, 84–88 (2005).
- Carbon nanotubes on a spider silk scaffold. Nat. Commun. 4, 2435 (2013).
- Evidence for antimicrobial activity associated with common house spider silk. BMC Res. Notes 5, 326 (2012).
- Liquid crystalline spinning of spider silk. Nature 410, 541–548 (2001).
- Toward spinning artificial spider silk. Nat. Chem. Biol. 11, 309–315 (2015).
- The structure and properties of spider silk. Endeavour 10, 37–43 (1986).
- Spider dragline silk: correlated and mosaic evolution in high-performance biological materials. Evolution 60, 2539–2551 (2006).
- Pyriform spidroin 1, a novel member of the silk gene family that anchors dragline silk fibers in attachment discs of the black widow spider, Latrodectus hesperus. J. Biol. Chem. 284, 29097–29108 (2009).
- Synthetic spider silk fibers spun from Pyriform Spidroin 2, a glue silk protein discovered in orb-weaving spider attachment discs. Biomacromolecules 11, 3495–3503 (2010).
- Ancient properties of spider silks revealed by the complete gene sequence of the prey-wrapping silk protein (AcSp1). Mol. Biol. Evol. 30, 589–601 (2013).
- Araneoid egg case silk: a fibroin with novel ensemble repeat units from the black widow spider, Latrodectus hesperus. Biochemistry 44, 10020–10027 (2005).
- Modular evolution of egg case silk genes across orb-weaving spider superfamilies. Proc. Natl. Acad. Sci. USA 102, 11379–11384 (2005).
- Molecular architecture and evolution of a modular spider silk protein gene. Science 287, 1477–1479 (2000).
- Nephila clavipes Flagelliform silk-like GGX motifs contribute to extensibility and spacer motifs contribute to strength in synthetic spider silk fibers. Biomacromolecules 14, 1751–1760 (2013).
- Variation in the chemical composition of orb webs built by the spider Nephila clavipes (Araneae, Tetragnathidae). J. Arachnol. 29, 82–94 (2001).
- Spider web glue: two proteins expressed from opposite strands of the same DNA sequence. Biomacromolecules 10, 2852–2856 (2009).
- Spider glue proteins have distinct architectures compared with traditional spidroin family members. J. Biol. Chem. 287, 35986–35999 (2012).
- in Spider Ecophysiology (ed. Nentwig, W.) 283–302 (Springer, 2013).
- Small organic solutes in sticky droplets from orb webs of the spider Zygiella atrica (Araneae; Araneidae): β-alaninamide is a novel and abundant component. Chem. Biodivers. 9, 2159–2174 (2012).
- Unraveling the mechanical properties of composite silk threads spun by cribellate orb-weaving spiders. J. Exp. Biol. 209, 3131–3140 (2006).
- Intragenic homogenization and multiple copies of prey-wrapping silk genes in Argiope garden spiders. BMC Evol. Biol. 14, 31 (2014).
- Spider genomes provide insight into composition and evolution of venom and silk. Nat. Commun. 5, 3765 (2014).
- Sequence conservation in the C-terminal region of spider silk proteins (Spidroin) from Nephila clavipes (Tetragnathidae) and Araneus bicentenarius (Araneidae). J. Biol. Chem. 269, 6661–6663 (1994).
- Extreme diversity, conservation, and convergence of spider silk fibroin sequences. Science 291, 2603–2605 (2001).
- N-terminal nonrepetitive domain common to dragline, flagelliform, and cylindriform spider silk proteins. Biomacromolecules 7, 3120–3124 (2006).
- Untangling spider silk evolution with spidroin terminal domains. BMC Evol. Biol. 10, 243 (2010).
- in Advances in Insect Physiology (ed. Casas, J.) Vol. 41, 175–262 (Burlington Academic Press, 2011).
- High-toughness silk produced by a transgenic silkworm expressing spider (Araneus ventricosus) dragline silk protein. PLoS One 9, e105325 (2014).
- The mechanical design of spider silks: from fibroin sequence to mechanical function. J. Exp. Biol. 202, 3295–3303 (1999).
- A molecular phylogeny of nephilid spiders: evolutionary history of a model lineage. Mol. Phylogenet. Evol. 69, 961–979 (2013).
- Identification and characterization of multiple Spidroin 1 genes encoding major ampullate silk proteins in Nephila clavipes. Insect Mol. Biol. 17, 465–474 (2008).
- Phylogenomics resolves a spider backbone phylogeny and rejects a prevailing paradigm for orb web evolution. Curr. Biol. 24, 1765–1771 (2014).
- MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011).
- Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).
- Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644 (2008).
- WebAUGUSTUS—a web service for training AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Res. 41, W123–W128 (2013).
- Spider minor ampullate silk proteins contain new repetitive sequences and highly conserved non-silk-like “spacer regions”. Protein Sci. 7, 667–672 (1998).
- Spider silk: the unraveling of a mystery. Acc. Chem. Res. 25, 392–398 (1992).
- Evidence from flagelliform silk cDNA for the structural basis of elasticity and modular nature of spider silks. J. Mol. Biol. 275, 773–784 (1998).
- Hypotheses that correlate the sequence, structure, and mechanical properties of spider silk proteins. Int. J. Biol. Macromol. 24, 271–275 (1999).
- MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).
- Spider webs and silks. Sci. Am. 266, 70–76 (1992).
- Silk gene transcripts in the developing tubuliform glands of the Western black widow, Latrodectus hesperus. J. Arachnol. 38, 99–103 (2010).
- Biology of spider silk. Int. J. Biol. Macromol. 24, 81–88 (1999).
- Carbonic anhydrase generates CO₂ and H⁺ that drive spider silk formation via opposite effects on the terminal domains. PLoS Biol. 12, e1001921 (2014).
- Proteomic evidence for components of spider silk synthesis from black widow silk glands and fibers. J. Proteome Res. 14, 4223–4231 (2015).
- Multi-tissue transcriptomics of the black widow spider reveals expansions, co-options, and functional processes of the silk gland gene toolkit. BMC Genomics 15, 365 (2014).
- Complex gene expression in the dragline silk producing glands of the Western black widow (Latrodectus hesperus). BMC Genomics 14, 846 (2013).
- From EST sequence to spider silk spinning: identification and molecular characterisation of Nephila senegalensis major ampullate gland peroxidase NsPox. Insect Biochem. Mol. Biol. 33, 229–238 (2003).
- A proteomics and transcriptomics investigation of the venom from the barychelid spider Trittame loki (brush-foot trapdoor). Toxins (Basel) 5, 2488–2503 (2013).
- A conserved spider silk domain acts as a molecular switch that controls fibre assembly. Nature 465, 239–242 (2010).
- Functional genomics reveals genes involved in protein secretion and Golgi organization. Nature 439, 604–607 (2006).
- Chromosome mapping of dragline silk genes in the genomes of widow spiders (Araneae, Theridiidae). PLoS One 5, e12804 (2010).
- Molecular and mechanical characterization of aciniform silk: uniformity of iterated sequence modules in a novel member of the spider silk fibroin gene family. Mol. Biol. Evol. 21, 1950–1959 (2004).
- Early events in the evolution of spider silk genes. PLoS One 7, e38084 (2012).
- Intragenic tandem repeats generate functional variability. Nat. Genet. 37, 986–990 (2005).
- Molecular origins of rapid and continuous morphological evolution. Proc. Natl. Acad. Sci. USA 101, 18058–18063 (2004).
- Scytodes vs. Schizocosa: predatory techniques and their morphological correlates. J. Arachnol. 33, 7–15 (2005).
- Spitting performance parameters and their biomechanical implications in the spitting spider, Scytodes thoracica. J. Insect Sci. 9, 1–15 (2009).
- Regulation and non-toxicity of the spit from the pale spitting spider Scytodes pallida (Araneae: Scytodidae). Ethology 111, 311–321 (2005).
- Spit and venom from Scytodes spiders: a diverse and distinct cocktail. J. Proteome Res. 13, 817–835 (2014).
- Silkworms transformed with chimeric silkworm/spider silk genes spin composite silk fibers with improved mechanical properties. Proc. Natl. Acad. Sci. USA 109, 923–928 (2012).
- Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
- FastUniq: a fast de novo duplicates removal tool for paired short reads. PLoS One 7, e52249 (2012).
- High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc. Natl. Acad. Sci. USA 108, 1513–1518 (2011).
- SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience 1, 18 (2012).
- Metassembler: merging and optimizing de novo genome assemblies. Genome Biol. 16, 207 (2015).
- Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. Gigascience 2, 10 (2013).
- BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
- Full-length transcriptome assembly from RNA–Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
- De novo transcript sequence reconstruction from RNA–seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).
- GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859–1875 (2005).
- STAR: ultrafast universal RNA–seq aligner. Bioinformatics 29, 15–21 (2013).
- Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–467 (2005).
- Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).
- Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6, 31 (2005).
- tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
- UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 43, D204–D212 (2015).
- Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
- The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 43, D213–D221 (2015).
- Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).
- Intrahost dynamics of antiviral resistance in influenza A virus reflect complex patterns of segment linkage, reassortment, and natural selection. MBio 6, e02464–14 (2015).
- Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS One 7, e47768 (2012).
- Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics 13, 238 (2012).
- Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv https://arxiv.org/abs/1303.3997 (2013).
- The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
- A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011).
- The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
- On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7, 256–276 (1975).
- Reference gene selection for insect expression studies using quantitative real-time PCR: the head of the honeybee, Apis mellifera, after a bacterial challenge. J. Insect Sci. 8, 1–10 (2008).
- Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCT) method. Methods 25, 402–408 (2001).
- Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
- Supplementary Figure 1: The golden orb-weaver spider’s morphology, reported silk gland anatomy, and web construction.
(a) Photographs of N. clavipes showing an adult female at the center hub of her orb web (left) and a view of the spinneret silk-extruding organs on the underside of the female abdomen (right). (b) Silk gland anatomy of N. clavipes, showing the seven different female araneoid gland morphologies found in the abdomen and the different classes of silk proteins produced. Each silk class has specific physical characteristics; for example, the minor and major ampullate spidroins produce silks with great tensile strength, flagelliform silk has great extensibility, and aggregate silks are non-fibrous sticky glues. This illustration (inspired by ref. 52) shows one set of silk glands and spinnerets from a bilateral pair, and indicates that each gland type produces a specific type of silk. However, our expression data (Fig. 5a) suggest that this is not the case, supporting previous findings (refs. 48, 50, 53) that individual glands can express multiple classes of spidroins. Note: the gland type coloration scheme and corresponding silk use pictograms defined here are used in later figures. (c) Putative applications of spider silk types in web construction (web diagram adapted from ref. 54), as described in previous studies. (i) Web building and maintenance: major ampullate silk is used for bridgelines and web radii; minor ampullate silk is used for the temporary spiral; piriform silk attaches fibers together and to substrates; flagelliform silk is used for the capture spiral; aggregate silks are sticky, aiding in adherence and prey capture. (ii) Prey wrapping: aciniform silk (top inset photo). (iii) Silk egg casings: tubuliform silk (bottom inset photo). References for silk classes and their purported uses are listed in the main text. (Photos provided by P.L.B.)
- Supplementary Figure 2: Maximum-likelihood phylogenetic gene tree of 28 N. clavipes spidroins in the context of 55 spidroins from other spider taxa.
The spidroin gene tree is rooted with a Bothriocyrtum californicum fibroin sequence (B.c. fibroin1; accession HM752562) and is based on multiple-sequence alignment (MSA) using the first ~130 amino acid residues of the N-terminal domain encoded by each gene. MSA was performed with Geneious, Clustal, and BLOSUM62, and the consensus tree was built with PhyML (Supplementary Note). Bootstrap proportions >50 (based on 1,000 replicates) are shown to the left of their respective nodes. Colors follow the gland/spidroin class designations shown in Supplementary Figure 1. Codes and accession numbers for different spidroins and taxa are listed in Supplementary Table 10.
- Supplementary Figure 3: Maximum-likelihood phylogenetic gene trees for the catalog of 28 spidroins identified in N. clavipes.
(a,b) Unrooted maximum-likelihood phylogenetic trees for the catalog of 28 spidroins identified in N. clavipes, shown in both transformed (a) and non-transformed (b) layouts. Both trees are based on a multiple-sequence alignment (MSA) using the first ~130 amino acid residues of the N-terminal domain for each N. clavipes spidroin. MSA was performed with Geneious, Clustal, and BLOSUM62, and the consensus tree was built with PhyML (Supplementary Note). Bootstrap proportions >50 (based on 1,000 replicates) are shown to the left of their respective nodes. Colors follow the gland/spidroin class designations shown in Supplementary Figure 1. Codes for N. clavipes spidroins are listed in Supplementary Table 12.
- Supplementary Figure 4: Agarose gel images of long-range PCR–amplified MiSp sequences used for validation of draft assembly, scaffold bridging, and gap closure.
The top panel highlights a single lane with an LR-PCR reaction (golden rectangle) for MiSp-c. The bottom panel highlights four lanes with LR-PCR reactions (golden rectangle) for MiSp-d. In both cases, multiple large bands are visible, indicating amplification of multiple targets that presumably represent genomic regions with high sequence similarity to the binding sites of the oligonucleotide primers used to isolate both MiSp types.
- Supplementary Figure 5: Distribution of amino acid frequency for N. clavipes ‘gold’ gene models.
Amino acid frequency distributions were calculated for all 20 amino acids for all mRNA transcripts from the gold gene model set (n = 17,989 mRNA sequences). Several spidroins were found at the extreme ends of the individual amino acid distributions (Supplementary Fig. 5 and Supplementary Note). Box-and-whisker plots show the range of frequency values (y axis) for the given amino acid residues (x axis). Thick black center lines represent median values. Upper whiskers represent the largest observation ≤ the upper quartile (Q3) + 1.5 × the interquartile range (IQR), and lower whiskers represent the smallest observation ≥ the lower quartile (Q1) − 1.5 × IQR.
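The per-sequence frequency calculation and the whisker rule described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the toy sequence and values are invented for illustration, and the quartile convention (Python's "inclusive" method) is an assumption, since the caption does not state which convention the plotting software used.

```python
from collections import Counter
from statistics import quantiles

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_frequencies(protein):
    """Return the fraction of each standard residue in one protein sequence."""
    counts = Counter(protein.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def whisker_bounds(values):
    """Return (lower, upper) whisker ends under the 1.5 x IQR rule.

    Quartiles use the 'inclusive' convention, which is an assumption here.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    upper = max(v for v in values if v <= q3 + 1.5 * iqr)
    lower = min(v for v in values if v >= q1 - 1.5 * iqr)
    return lower, upper


# Toy, glycine/alanine-rich fragment (hypothetical, not a real spidroin).
freqs = aa_frequencies("GGAGAS")
bounds = whisker_bounds([1, 2, 3, 4, 100])  # 100 lies beyond the upper fence
```

In the toy data the value 100 falls outside Q3 + 1.5 × IQR, so it would be drawn as an outlier rather than extending the whisker.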
- Supplementary Figure 6: Distribution of amino acid frequency for 28 N. clavipes spidroin genes.
Amino acid frequency distributions were calculated for all 20 amino acids for all N. clavipes spidroin genes (n = 28 sequences). Box-and-whisker plots show the range of frequency values (y axis) for the given amino acid residues (x axis). Thick black center lines represent median values. Upper whiskers represent the largest observation ≤ the upper quartile (Q3) + 1.5 × the interquartile range (IQR), and lower whiskers represent the smallest observation ≥ the lower quartile (Q1) − 1.5 × IQR. Overall, spidroins are enriched in alanine, glycine, and serine residues, whose proportions differ significantly from those in the 17,989 mRNA sequences of the gold gene set (Wilcoxon rank-sum test; Supplementary Fig. 5 and Supplementary Note). **P < 0.01.
- Supplementary Figure 7: Shared and private motif occurrences in N. clavipes spidroins.
Bar graph comparing the number of shared (gold) versus private (dark gray) distinct repetitive motif occurrences observed in the different N. clavipes spidroins (n = 28 sequences).
- Supplementary Figure 8: Shared and private cassette occurrences in N. clavipes spidroins.
Bar graph comparing the number of shared (gold) and private (dark gray) distinct repetitive cassette occurrences observed in the different N. clavipes spidroins (n = 28 sequences).
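One way to read the shared-versus-private counts in Supplementary Figures 7 and 8 is sketched below, assuming that a motif or cassette is "private" when it occurs in exactly one spidroin and "shared" when it occurs in two or more. That definition, the function name, and the toy gene and motif labels are all assumptions made for illustration; the source does not spell out the counting procedure.

```python
from collections import Counter


def shared_private_counts(motifs_by_gene):
    """Map each gene to (n_shared, n_private) distinct motifs.

    A motif counts as shared if at least two genes carry it, else private.
    """
    # Number of genes carrying each distinct motif.
    carriers = Counter(
        motif for motifs in motifs_by_gene.values() for motif in set(motifs)
    )
    result = {}
    for gene, motifs in motifs_by_gene.items():
        distinct = set(motifs)
        shared = sum(1 for motif in distinct if carriers[motif] > 1)
        result[gene] = (shared, len(distinct) - shared)
    return result


# Hypothetical gene names and motif strings, for illustration only.
toy = {
    "geneA": ["GGX", "AAAA", "GGX"],
    "geneB": ["GGX", "GPGQQ", "SSS"],
}
counts = shared_private_counts(toy)
```

Repeated occurrences within one gene are collapsed before counting, which matches the caption's reference to distinct motifs.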
- Supplementary Figure 9: RNA–seq expression patterns of spidroin genes in 13 N. clavipes tissue samples.
Heat map showing the absolute number of normalized RNA–seq reads that map to spidroin transcripts, assayed in ten individual silk glands, one venom gland isolate, and two brain isolates collected from two non-gravid females, Nep-008 and Nep-009. Owing to extensive sequence similarity between MaSp-b and MaSp-c, it was not possible to distinguish between reads that mapped to these two spidroins; thus, data for these two transcripts are presented together as “MaSp-b,c”. Reads mapping to MaSp-h and AgSp-c exceeded the heat map’s informative range; thus, we have included bar graph insets (right) confirming that reads mapping to MaSp-h (top inset) and AgSp-c (bottom inset) are substantially more abundant in silk glands than in venom gland or brain.
- Supplementary Figure 10: Distributions of relative expression values for 29 N. clavipes genes in seven tissue types.
Box-and-whisker plots of the relative expression for all 28 N. clavipes spidroin genes and 1 venom gene (PR-1) in tissue dissections (n = 3 independent-specimen biological replicates per tissue) assayed by qPCR. Tissues included venom glands, five anatomically distinct silk glands, and ‘other’ silk glands (aciniform and piriform glands attached to the spinneret); they are labeled to the left of the y axis, with panels organized in rows by tissue type. Relative expression (2^(−ΔΔCT) method; ref. 46) is plotted on the y axis (log10 scale). Box-and-whisker plots show the range of expression values for the given genes (x axis) relative to RPL13a (housekeeping gene) expression and normalized to leg tissue. Thick black center lines represent median values. Upper whiskers represent the largest observation ≤ the upper quartile (Q3) + 1.5 × the interquartile range (IQR), and lower whiskers represent the smallest observation ≥ the lower quartile (Q1) − 1.5 × IQR. Asterisks indicate a single silk gland type exhibiting significantly greater expression values for a given gene versus all other silk gland types together (Wilcoxon rank-sum test). **P < 0.01.
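The relative-expression arithmetic named in this caption can be sketched as follows. This is a minimal illustration of the standard ΔΔCT calculation with RPL13a as the reference gene and leg tissue as the calibrator, as the caption states; the function name and the CT values are hypothetical, and the sketch assumes roughly 100% amplification efficiency (a doubling per cycle), which the caption does not confirm.

```python
def relative_expression(ct_target_tissue, ct_ref_tissue,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the 2^(-ddCT) calculation.

    ct_*_tissue: CT values in the tissue of interest (e.g. a silk gland).
    ct_*_calibrator: CT values in the calibrator tissue (leg, per the caption).
    'ref' is the housekeeping gene (RPL13a, per the caption).
    """
    delta_tissue = ct_target_tissue - ct_ref_tissue
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta = delta_tissue - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: the target crosses threshold 5 cycles earlier
# (relative to the reference) in the gland than in leg tissue.
fold = relative_expression(20.0, 18.0, 25.0, 18.0)
```

A lower CT means more template, so a target that amplifies five cycles earlier after normalization corresponds to a 2^5-fold higher relative expression.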
- Supplementary Figure 11: Mean relative expression values of 29 N. clavipes genes in seven tissue types.
Heat map showing the relative expression of N. clavipes spidroin loci in tissue dissections (n = 3 biological replicates per tissue) assayed by qPCR. Tissues included venom glands, five anatomically distinct silk glands, and ‘other’ silk glands (aciniform and piriform glands attached to the spinneret) and are shown on the x axis, with spidroins arranged on the y axis. The heat map panels depict relative mean fold change in gene expression (2^(−ΔΔCT) method; ref. 46) per tissue (distinct tissue dissections from n = 3 individuals) over RPL13a and normalized to leg tissue.
- Supplementary Figure 12: RNA–seq expression patterns of SSTs in 13 N. clavipes tissue samples.
Heat map showing the absolute number of normalized reads that map to 649 non-spidroin silk gland–specific transcripts (SSTs), assayed in ten individual silk glands, one venom gland, and two brain isolates collected from two non-gravid females, Nep-008 and Nep-009. SSTs are vertically clustered based on the filtering method used for discovery (Supplementary Note), as noted by colored vertical bars at the right of the heat map. The categories defined on the left are described in Supplementary Table 15.
- Supplementary Figure 13: Polymorphism levels of genes and genic features in the N. clavipes genome.
(a) Box-and-whisker plot comparing the distribution of θW values (ref. 43), derived from SNP counts, for 14,025 gold gene sequences with the distribution of θW values for 28 N. clavipes spidroins. Box-and-whisker plots show the range of θW values for each gene set. Thick black center lines represent median values. Upper whiskers represent the largest observation ≤ the upper quartile (Q3) + 1.5 × the interquartile range (IQR), and lower whiskers represent the smallest observation ≥ the lower quartile (Q1) − 1.5 × IQR. Asterisks indicate that the 28 N. clavipes spidroin genes exhibit significantly greater θW values than the collected gold gene set (Wilcoxon rank-sum test; Supplementary Note). **P < 0.01. (b) Vertical bar graph showing the mean θW values for 11 genomic feature categories, including many gold gene model subfeatures, in comparison to the mean θW values for N. clavipes spidroins, silk N termini, and silk C termini. (c) Bar graph depicting the θW values for individual N. clavipes spidroins.
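For readers unfamiliar with θW, the sketch below shows the standard Watterson estimator computed from a count of segregating sites (SNPs). It is an illustration only: the function name, the number of sampled chromosomes, and the SNP count are hypothetical, and the caption does not state the sample size or whether values were normalized per site.

```python
def watterson_theta(segregating_sites, n_chromosomes, sequence_length=None):
    """Watterson's estimator: theta_W = S / a_n, with a_n = sum_{i=1}^{n-1} 1/i.

    If sequence_length is given, the estimate is returned per site.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    theta = segregating_sites / a_n
    if sequence_length is not None:
        theta /= sequence_length
    return theta


# Hypothetical example: 11 SNPs among 4 sampled chromosomes.
# a_4 = 1 + 1/2 + 1/3 = 11/6, so theta_W = 11 / (11/6).
theta_gene = watterson_theta(11, 4)
theta_per_site = watterson_theta(11, 4, sequence_length=1000)
```

Dividing by sequence length makes genes of different sizes comparable, which is presumably needed when comparing long repetitive spidroins with the rest of the gene set.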
- Supplementary Text and Figures
Supplementary Figures 1–13, Supplementary Tables 1–12 and Supplementary Note