High-yield Production of Amyloid-β Peptide Enabled by a Customized Spider Silk Domain

During storage in the silk gland, the N-terminal domain (NT) of spider silk proteins (spidroins) keeps the aggregation-prone repetitive region in solution at extreme concentrations. We observe that NTs from different spidroins have co-evolved with their respective repeat region, and now use an NT that is distantly related to previously used NTs for efficient recombinant production of the amyloid-β peptide (Aβ) implicated in Alzheimer’s disease. A designed variant of NT from Nephila clavipes flagelliform spidroin, which in nature allows production and storage of β-hairpin repeat segments, gives exceptionally high yields of different human Aβ variants when used as a solubility tag. This tool enables efficient production of target peptides also in minimal medium and gives up to 10 times more isotope-labeled monomeric Aβ peptides per liter bacterial culture than previously reported.

For structural studies of amyloid fibrils, but also for certain in vivo experiments, the availability of large quantities of isotope-labeled Aβ is essential.
Studies of Aβ aggregation in vitro have often been conducted with synthetically produced peptides 18,19. Synthetic preparations have several drawbacks, including batch-to-batch variations, intrinsic impurities and relatively high cost, especially for isotope labeling. As a consequence, several recombinant expression systems have been established. These production protocols either result in peptides with an initiating non-native methionine residue 20,21 or are based on solubility tags that require proteolytic cleavage to obtain the native human Aβ sequence 22–24. The main disadvantage of having methionine as the first residue is that it might affect processes such as posttranslational modifications, e.g. pyroGlu formation 25–27, and metal ion binding, since the metal ion-binding site is located in the N-terminus 28–30. Here we describe a useful solubility tag for production of aggregation-prone proteins and peptides, and demonstrate that this tool enables very efficient production of native and isotope-labeled Aβ peptides.

Results and Discussion
Evolutionary relationships of NT and repetitive regions of different spidroins. A phylogenetic tree based on sequence alignment of 67 NTs found in GenBank (Supplementary Fig. S1) reveals evolutionary relationships between NTs and their respective repetitive regions (Fig. 1A). The NT domains cluster according to silk type, as previously reported 7. Hence, the NTs of different spidroin types, which are defined by the nature of their respective repetitive regions, have been conserved through evolution of different spider species. Structural characteristics of the repetitive regions appear to co-vary with the evolution of NTs. For example, for the tubuliform (TuSp) and aciniform (AcSp) NTs, which are evolutionarily close, the corresponding repetitive regions stand out by forming globular folded domains 31,32 (Fig. 1A). NT from FlSp is linked to a unique repetitive region that contains several embedded spacers (each 27 residues), which are predicted to form β-hairpins 33 (Fig. 1A). NT FlSp exhibits a distant evolutionary relationship (<35% sequence identity) to the previously reported NT MaSp 6 (Fig. 1B), and MaSps contain repetitive regions with predicted α-helical and random coil structures 34–38. We speculate that different NTs may have evolved to facilitate optimal solubility of their respective repeat region in the silk gland during storage conditions, where the pH is neutral and NT is monomeric 4,5. Irrespective of any potential evolutionary co-variation between NTs and the repetitive regions, we aimed to explore whether NT FlSp could work in protein expression in an equivalent way to the previously investigated and distantly related NT MaSp 6.
Design of the novel solubility tag NT* FlSp. To prevent dimerization of NT FlSp at low pH we introduced the D40K and K65D mutations 6 in NT FlSp from Nephila clavipes (Nc) (Fig. 1B) (numbering as described previously 6; the mutations therefore correspond to positions 36 and 60 in Fig. 1B).
NT* FlSp has a larger number of charged residues (25 vs. 11) and a stronger net charge (−7 vs. −5) compared to NT* MaSp, which potentially enhance its solubility properties. In contrast to NT* MaSp, NT* FlSp has no tryptophan, whose absorbance at 280 nm would mask the intrinsically low absorbance at 280 nm of the target peptide Aβ. Thus, for size exclusion chromatography (SEC), where detection relies on the protein absorbance at 280 nm, NT* FlSp enables clearly separated intensity peaks.
Efficient expression and purification of Aβ monomers using NT* FlSp. We designed the fusion protein NT* FlSp -Aβ by fusing the coding sequences of the solubility tag NT* FlSp and Aβ with a TEV recognition site in between (Fig. 1C). An overview of the expression and purification scheme is given in Fig. 2. The fusion protein was expressed in BL21(DE3) E. coli cells grown in rich or minimal medium, dissolved in 8 M urea after cell lysis and purified using immobilized metal ion affinity chromatography (IMAC). Urea was added as a denaturant to increase binding to the IMAC column. For optimal cleavage of the fusion protein by TEV, a buffer exchange was conducted, either by overnight dialysis or by column chromatography. TEV cleavage can alternatively be conducted during buffer dialysis to speed up the purification, but a short dialysis step to decrease the urea concentration below 2 M is recommended before the addition of TEV protease. Finally, the solution was applied to SEC with a Superdex 30 column, whereby Aβ monomers were isolated.
The expression and purification protocol presented here results in highly pure Aβ 40 and Aβ 42 monomers within 2.5-4.5 days, depending on buffer exchange and cleavage method. The yields in rich and minimal medium are listed in Table 1. The example shown in Fig. 2 represents purifications from 100 and 500 mL culture medium, yielding very similar amounts of 37 ± 7 mg of pure Aβ 42 monomers if extrapolated to one liter of culture. Notably, the present scheme gives by far the highest yields, both in rich and minimal medium, compared to other reported protocols (Table 2).
The purified peptides were investigated using mass spectrometry, confirming the expected masses for Aβ, here shown for Aβ 40 (Fig. 3A). Using 13C,15N-double-labeled Aβ 40 and Aβ 42 we performed nuclear magnetic resonance (NMR) experiments to confirm the purity and structural state of the purified peptides. We recorded 1H-15N HSQC experiments (Fig. 3B and Supplementary Fig. S3) where the chemical shifts of the cross-peaks coincide with previous assignments reported in the literature, revealing a monomeric, predominantly unstructured conformation of the purified Aβ peptides 39,40. To analyze the secondary structure of monomeric Aβ 42 we applied circular dichroism (CD) spectroscopy. The initial CD spectra (Fig. 3C) indicated a predominantly unstructured conformation as previously reported 20,21,41,42. Taken together, these experiments confirm that our method results in monomeric Aβ 40 and Aβ 42 peptides.

Production of 4-fluoro-Phe-labeled Aβ peptides. The present approach also opens new opportunities for NMR studies that require more complex isotope labeling approaches associated with reduced protein yields. For example, we have used the NT* FlSp tag to produce monomeric Aβ 42 incorporating 4-fluoro-Phe (4FF-Aβ 42 ) in milligram yields. The expression was performed similarly as described above, but glyphosate and DL-tyrosine were added to the culture before DL-4-fluorophenylalanine (see Methods).
Aggregation kinetics of native and isotope-labeled Aβ 42. To ensure that the isolated peptides behave as expected, we investigated the aggregation kinetics starting from monomeric Aβ peptides. Recording CD signals under continuous stirring at 37 °C of 10 μM Aβ 42, a structural conversion from an unstructured to a β-structured conformation was observed (Fig. 3C), where the isodichroic point at 208 nm indicates a two-state transition. Furthermore, we used 50 μM 4FF-Aβ 42 for real-time aggregation 19F-NMR studies at 25 °C, revealing a decrease of 4FF-signals over time (Fig. 3D). The signal loss can be fitted to a sigmoidal decline, with an aggregation half time of 258 ± 5 min under the conditions used.
Alternatively, Aβ aggregation kinetics can be monitored using the fluorescent dye thioflavin T (ThT), for a detailed elucidation of the nucleation mechanism. Here, we conducted ThT experiments on Aβ 42 in 20 mM sodium phosphate buffer, pH 8.0, at 37 °C under quiescent conditions at different initial Aβ 42 monomer concentrations, [Aβ] (Fig. 4). The final fluorescence intensity exhibits a linear dependence on the initial monomer concentration (Fig. 4D), suggesting that the total amount of initially monomeric peptides forms ThT-active fibril material, as previously shown for Aβ 12,13,42. The aggregation half times of Aβ 42 used here exhibit a simple relation τ1/2 ∝ [Aβ]^γ, with γ = −1.0 ± 0.1, corresponding to the slope in a double logarithmic plot (Fig. 4C). This value is in the same range as found for Aβ M42 with an initial methionine, where γ = −1.3 was reported 12. For γ = −1.0, a multi-step secondary nucleation model describes the observed aggregation traces better than a single-step secondary nucleation model (Fig. 4A,B). The multi-step model additionally includes saturation of secondary nucleation and was previously shown to be applicable for the shorter Aβ 40 and Aβ M40 variants 13,29,42 and for Aβ M42 at pH 7.4 43, which all exhibit higher γ-values, but it also describes well the kinetics of Aβ M42 at pH 8.0 44. Hence, these results confirm that the native and isotope-labeled peptides obtained herein are highly pure and in a monomeric state, which is essential for accurate and reproducible aggregation kinetics experiments.
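As an illustration of how the scaling exponent γ relates to the slope in a double logarithmic plot, the following minimal Python sketch estimates γ by ordinary least squares on log-transformed data. The function name and the data are hypothetical and purely synthetic (generated to follow τ1/2 ∝ [Aβ]^−1); they are not the measurements of this study, and the fitting routine actually used for Fig. 4C is not specified here.

```python
import math


def scaling_exponent(concentrations, half_times):
    """Return the least-squares slope of log(half time) vs. log(concentration)."""
    xs = [math.log(c) for c in concentrations]
    ys = [math.log(t) for t in half_times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example (assumption): half times that scale exactly as 1/[Abeta].
conc_uM = [1.0, 2.0, 3.0, 5.0, 9.0]
tau_h = [2.0 / c for c in conc_uM]
gamma = scaling_exponent(conc_uM, tau_h)
print(f"gamma = {gamma:.2f}")
```

With real data, the scatter of the replicate half times would translate into an uncertainty on the slope, which this sketch does not compute.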
Conclusions
Taken together, we have developed a biomimetic tool that provides facile, fast and inexpensive production of pure and monomeric Aβ 40 and Aβ 42 peptides. The high yield obtained also in minimal medium enables efficient generation of isotope-labeled Aβ peptides. Peptides produced by our protocol recapitulate the behavior of Aβ peptides obtained by other means, which indicates the applicability of NT* FlSp for generating functional Aβ peptides. The NT* FlSp tag holds great potential, also when compared to NT* MaSp 6, for efficient production of medically relevant aggregation-prone peptides and proteins. This is important since the majority of new pharmaceuticals are biologics, and facile protocols for efficient production of proteins that are difficult to produce are needed.

Methods
Expression and purification protocol. The synthetic gene coding for NT FlSp from Nephila clavipes with the D40K and K65D mutations (NT* FlSp ) was ordered from GenScript (GenScript Biotech, Netherlands). The NT* FlSp gene was ligated into pT7 plasmid containing TEV recognition site (TRS)-Aβ 40 /Aβ 42 previously 6. The plasmids were transformed into chemically competent E. coli BL21 (DE3) cells and expressed as described previously 45. In short, 1 mL overnight culture was used to inoculate 100 mL LB medium (1/100), or 100 mL M9 overnight culture was used to inoculate 1 L M9 minimal medium (10/100), with 70 mg/L kanamycin. Cells were grown at 30 °C at 120 RPM to OD 600nm around 0.8-0.9, at which point the temperature was lowered to 20 °C, 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) was added, and the cells were incubated overnight. To isolate the cells from media, the bacterial culture was centrifuged at 5,000 × g for 20 minutes at 4 °C and the cell pellets  (Fig. 2C). The correct size of Aβ, NT* FlSp and TEV was confirmed by SDS-PAGE in a 4-20% polyacrylamide gel, stained with Coomassie brilliant blue dye (Fig. 2B). For expression of 15N- and 13C-labelled NT* FlSp -Aβ, the same procedure was used except that M9 minimal medium containing 15NH4Cl (1 g/L M9) and 13C-glucose (4 g/L M9) was used. The plasmid pRK793 for TEV expression was obtained from Addgene (addgene.org, deposited by David S. Waugh) and was expressed as described above and purified as described previously 46.
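The per-liter quantities stated in this protocol can be scaled to other culture volumes with simple arithmetic. The Python sketch below is only a convenience illustration: the function name is hypothetical, and the molar mass of IPTG (238.30 g/mol, anhydrous) is an assumption that is not stated in the protocol.

```python
# Assumption: molar mass of anhydrous IPTG; not given in the protocol.
IPTG_MOLAR_MASS_G_PER_MOL = 238.30


def labeled_m9_additions(volume_l):
    """Scale the stated per-liter additions to a culture volume in liters."""
    return {
        "15NH4Cl_g": 1.0 * volume_l,        # 1 g/L M9
        "13C_glucose_g": 4.0 * volume_l,    # 4 g/L M9
        "kanamycin_mg": 70.0 * volume_l,    # 70 mg/L
        # 0.1 mM IPTG converted from mol/L to mg
        "IPTG_mg": 0.1e-3 * IPTG_MOLAR_MASS_G_PER_MOL * volume_l * 1000.0,
    }


additions = labeled_m9_additions(0.5)
for name, amount in additions.items():
    print(f"{name}: {amount:.2f}")
```

The sketch ignores the small volume change caused by adding the overnight inoculum.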

Determination of yields.
Both NT* FlSp -Aβ 40 and NT* FlSp -Aβ 42 were transformed into BL21 (DE3) E. coli cells and spread onto an agar plate with kanamycin. Five starter cultures of LB and M9 were inoculated with individual colonies and incubated at 31 °C overnight. The expression was performed as described above in 100 mL LB and M9 media. 100 μL of each culture was taken before and after induction, lyophilized, dissolved in SDS loading buffer and boiled for 10 minutes at 96 °C. 1 μL of each induced sample and 1 μL uninduced sample from each condition were loaded on a 4-20% Mini-PROTEAN TGX gel (Bio-Rad) and blotted on a PVDF membrane (GE Healthcare). 5% w/v non-fat dry milk/PBS was used to block the membrane after blotting for 1 h, followed by incubation with 6E10 primary antibody in 5% w/v non-fat dry milk, 0.1% Tween/PBS overnight at 4 °C. The membranes were washed three times with 0.1% Tween/PBS, and ECL anti-mouse secondary antibodies in 5% w/v non-fat dry milk and 0.1% Tween/PBS were added for 1 h at room temperature. Enhanced chemiluminescence detection reagent (GE Healthcare) was added and images were acquired using an AI600 imaging system (GE Healthcare). The concentration of each sample was calculated by integration of the peaks from IMAC (fusion protein) and SEC (monomeric Aβ) with an extinction coefficient ε280 = 2,980 M−1 cm−1 for the fusion protein and 1,424 M−1 cm−1 for Aβ. Western blot intensities were analyzed with ImageJ software 47, and the average and standard deviation from 5 replicates were calculated using yields from one full purification of each condition. Values are listed in Table 1.
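The conversion from an integrated chromatogram peak to a peptide mass follows from the Beer-Lambert law. The Python sketch below illustrates that arithmetic under stated assumptions: the function names are hypothetical, the optical path length of the detector cell (1 cm) and the molar mass of unlabeled Aβ 42 (4514.1 g/mol) are assumptions not given in the text, and the chromatogram is a synthetic triangular peak rather than measured data. Only the extinction coefficient of 1,424 M−1 cm−1 for Aβ is taken from the protocol.

```python
def peak_area(volumes_ml, absorbance_mau):
    """Trapezoidal integral of A280 over elution volume, in mAU*mL."""
    area = 0.0
    for i in range(1, len(volumes_ml)):
        dv = volumes_ml[i] - volumes_ml[i - 1]
        area += 0.5 * (absorbance_mau[i] + absorbance_mau[i - 1]) * dv
    return area


def mass_mg_from_area(area_mau_ml, epsilon_per_m_cm, molar_mass_g_per_mol,
                      path_cm=1.0):
    """Convert a peak area (mAU*mL) to mass in mg via Beer-Lambert."""
    # mAU -> AU (1e-3) and mL -> L (1e-3) give mol after dividing by epsilon*l
    moles = area_mau_ml * 1e-3 * 1e-3 / (epsilon_per_m_cm * path_cm)
    return moles * molar_mass_g_per_mol * 1e3  # g -> mg


# Synthetic triangular SEC peak (assumption, not measured data)
volumes = [10.0, 11.0, 12.0]
a280 = [0.0, 100.0, 0.0]
area = peak_area(volumes, a280)
# Assumed molar mass of unlabeled Abeta42: 4514.1 g/mol
mass = mass_mg_from_area(area, 1424.0, 4514.1)
print(f"area = {area:.1f} mAU*mL, mass = {mass:.3f} mg")
```

For isotope-labeled peptides the molar mass would need to be adjusted, and the path length should be taken from the specifications of the detector actually used.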
Expression protocol of 4FF-Aβ 42. The plasmid pT7His6NT* FlSp -TEV recognition site -Aβ 42 was transformed into chemically competent E. coli BL21(DE3) cells. Colonies were inoculated to 10 mL LB medium with 70 mg/L kanamycin and grown at 30 °C and 200 r.p.m. to OD 600nm > 1.0. 0.5 mL day culture was inoculated to 25 mL M9 medium with 70 mg/L kanamycin and grown at 30 °C and 200 r.p.m. overnight. 10 mL overnight culture was inoculated to 1 L M9 medium and cells were further grown at 30 °C. Uniform labeling with 4-fluorophenylalanine was achieved by the introduction of 1 g/L glyphosate and 75 mg/L DL-tyrosine to shaking bacterial cultures at 30 °C which had reached an OD 600nm of 0.6. Once the cells had reached an OD 600nm of 0.8, 30 mg/L DL-4-fluorophenylalanine was added. The incubation temperature was lowered to 20 °C and expression was induced with the addition of IPTG to 0.1 mM. The cells were incubated overnight and harvested by centrifugation at 7,000 × g at 4 °C.
Evolutionary relationships of the NT domains of different spidroins. The evolutionary history of the NT domains from different spidroins was inferred by the Neighbor-Joining method with the Poisson correction. Evolutionary analyses were conducted in MEGA7 48. The analysis involved 67 amino acid sequences. In the spider silk gland (liquid protein), the repetitive region of MaSp, consisting of GGX, polyA, GX and GPGQQ, is disordered and partially helical 34–38, and MiSp and FlSp share identical motifs 33. However, there is a ∼127-aa spacer in MiSp, which adopts an α-helical conformation, whereas the 27-aa spacer in FlSp is predicted to fold into a β-hairpin 33. The large repetitive domains of AcSp and TuSp adopt an α-helical conformation 31,32, and the repetitive domains of PySp are also predicted to adopt an α-helical conformation by PSIPRED v3.3 (http://bioinf.cs.ucl.ac.uk/psipred/).
Nuclear magnetic resonance (NMR). 1H-15N HSQC spectra were recorded on a 500 MHz or 700 MHz Bruker Avance spectrometer equipped with cryogenic probes. The HSQC spectrum of Aβ 40 was recorded at 500 MHz at 8 °C using 75 μM peptide concentration in 16 mM sodium phosphate buffer, pH 7.4, with 0.02% NaN3 and 0.2 mM EDTA. For Aβ 42 the peptide concentration was 15 μM in 20 mM sodium phosphate buffer, pH 6.8, recorded at 5 °C and 700 MHz. The spectra were recorded using 2048 × 128 complex points and 32 scans per transient. For Aβ 42 we recorded the HSQC at 15 μM directly after the SEC purification, ensuring the monomeric state of the peptide. 19F-NMR experiments were recorded using 50 μM 4FF-Aβ 42 in 20 mM sodium phosphate buffer, pH 7.4, with 0.03% NaN3 and 1 mM EDTA at 25 °C and 565 MHz. The 19F spectrum was acquired with 512 transients and a 1.0 s pulse delay between each transient. Line broadening of 1.0 Hz was used to process the final spectrum. The 1H-15N HSQC spectrum of 15 μM 4FF-Aβ 42 in 20 mM sodium phosphate buffer, pH 7.4, with 0.02% NaN3 and 0.2 mM EDTA, was recorded at 4 °C on a 600 MHz Bruker Avance Neo spectrometer equipped with a cryogenic probe.
Circular dichroism (CD). CD measurements of 10 μM Aβ 42 in 20 mM sodium phosphate buffer, pH 8.0, at 37 °C were performed in a quartz glass Suprasil 10 × 4 mm CD cuvette (Hellma Analytics) where the optical path length was 4 mm, using a Chirascan CD spectrometer (Applied Photophysics). A resolution of 1.0 nm and a bandwidth of 1 nm were chosen for the aggregation kinetics experiments 42. During the entire measurement the sample was continuously stirred at around 1200 r.p.m. and every 3 min a new CD spectrum was recorded to follow the aggregation kinetics.

Thioflavin T (ThT) fluorescence kinetics experiments.
For ThT aggregation kinetics experiments, 1 to 9 μM monomeric Aβ 42 obtained after SEC purification 45 was used. ThT fluorescence was measured as described previously 45 using 96-well microplates, where each well contained 80 μL sample solution with 10 μM ThT.
Analysis of ThT aggregation kinetics. Aggregation traces were first analyzed using a fit to a sigmoidal function, revealing the aggregation half time, τ1/2 29,42,45. Subsequently, the aggregation traces were normalized and averaged over six replicates. The averaged aggregation half times are related to the initial monomer concentration, [Aβ], by τ1/2 ∝ [Aβ]^γ, where γ reflects the slope in a double-logarithmic plot (Fig. 4C). Further, we applied a