Bottlenecks in metabolic pathways due to insufficient gene expression levels remain a significant problem for industrial bioproduction using microbial cell factories. Increasing gene dosage can overcome these bottlenecks, but current approaches suffer from numerous drawbacks. Here, we describe HapAmp, a method that uses haploinsufficiency as evolutionary force to drive in vivo gene amplification. HapAmp enables efficient, titratable, and stable integration of heterologous gene copies, delivering up to 47 copies onto the yeast genome. The method is exemplified in metabolic engineering to significantly improve production of the sesquiterpene nerolidol, the monoterpene limonene, and the tetraterpene lycopene. Limonene titre is improved by 20-fold in a single engineering step, delivering ∼1 g L−1 in the flask cultivation. We also show a significant increase in heterologous protein production in yeast. HapAmp is an efficient approach to unlock metabolic bottlenecks rapidly for development of microbial cell factories.
To achieve economically viable rates, yields and titres for a given product in microbial cell factories, it is commonly necessary to increase expression of introduced genetic constructs1,2. This is typically achieved by manipulating transcription levels via transcriptional control elements (promoters and other genetic sequences)3. However, this approach is subject to thresholds on individual constructs. This often means that expression levels are insufficient for a desired application. For example, enzymes with poor catalytic properties that cannot be improved by enzyme engineering represent significant flux bottlenecks in metabolic engineering4. In addition, where extremely high product levels are required (e.g., protein production systems), very high expression can deliver a direct economic benefit to the bioprocess. Increasing the gene dosage can be used to overcome transcriptional thresholds and increase expression levels.
The brewer’s yeast Saccharomyces cerevisiae is a eukaryotic model organism and an important industrial microorganism for production of biofuels, biochemicals, and biopharmaceuticals. In S. cerevisiae, multi-copy yeast episomal plasmids or genome integration into ribosomal DNA (rDNA) sites are typically used to increase gene dosage5,6,7,8. However, these approaches are not stable in the absence of selection pressure, and plasmids can suffer from copy number instability leading to variable expression levels5,6,7,8. In addition, use of selection systems in industrial processes adds additional costs and often is not scalable9,10. To stabilise strains without the need for selective antibiotic or auxotrophy systems, auto-selection markers such as glycolytic genes (FBA1, fructose-bisphosphate aldolase; POT1/TPI1, triosephosphate isomerase) can be used5,11,12. However, this requires the background strains to have the correct genotype for knock-out. Transposable elements can also be used for multi-copy integration, however variable copies are integrated at random loci on genome, which means integrated components cannot be removed to facilitate future engineering steps (for example, swapping terpenoid synthases for different terpenoid production platforms)13,14,15,16,17. A method overcoming all these limitations is highly desirable.
Gene amplification commonly happens in nature during cell proliferation, as part of molecular evolution, as well as in some laboratory experiments2,18,19,20,21,22,23. In yeast, tandem amplification of fitness-associated genes on the genome permits improved survival and propagation of cells under new or changing conditions18,19,20. For example, amplification of the xylose isomerase, cellobiose-utilisation, and copper resistance (CUP1) genes occurs over prolonged adaptive cultivation on xylose19,20, cellubiose24, and copper ions25, respectively. Another example is the amplification of tandem repeated rDNA under some conditions26. These examples demonstrate that if the expression level of a gene product is tightly linked to growth fitness and cannot meet the needs for maximum growth, gene amplification can occur through adaptive evolution.
In diploids, haploinsufficiency describes a state whereby one allele at a heterozygous locus provides little or no product, and the combined product from both alleles is insufficient to deliver the wild type phenotype27. Expression dosage of haploinsufficient genes links tightly with the growth fitness in yeast28. This can be explored as an evolutionary force to drive gene amplification and as a selection pressure for maintenance of the amplified constructs under normal cultivation conditions.
Here, we design an artificial genetic structure that enables amplification of a haploinsufficient gene through tuning of its promoter strength or translational efficiency (HapAmp). This structure is incorporated into genetic vectors which can be used to introduce multiple copies of linked heterogeneous genes on the genome. We exemplify the applications of this technique by developing yeast factories for improved production of terpenes by metabolic engineering and for high production of pharmaceutically relevant proteins.
Construct design for in vivo gene amplification
Two elements are required for gene amplification to occur: (1) a gene linked to cell fitness, and (2) homologous DNA sequences to support recombination20. In addition, a strong replication origin can promote amplification29,30,31. These three elements exist in tandem repeat in the rDNA region and the CUP1 region in the yeast genome (Fig. 1a).
We designed a genetic structure for gene amplification in yeast (Fig. 1b). The construct has recombination arms at each end. Arm 1 is homologous to the promoter region of a haploinsufficient gene, and Arm 2 is homologous to the initial part of the haploinsufficient gene open reading frame. This allows insertion of the construct into the genome by homologous recombination. Downstream of Arm 1 are a selectable marker for transformation selection and homologous Arm 3, which is homologous to the terminator region of the haploinsufficient gene. Between Arm 3 and Arm 2, there are an autonomous replicating sequence (ARS) and a promoter. The promoter is weaker than the native promoter of the haploinsufficient gene and positioned such that integration results in substitution of the native promoter of the haploinsufficient gene with the weaker promoter. Genes of interest, to be expressed heterologously, can be inserted between Arm 3 and the weaker promoter.
Driving expression through a weaker promoter attenuates the protein yield from each copy of the haploinsufficient gene. This, in turn, is expected to decrease the growth rate in yeast. Native amplification of the region between homologous Arm 3 will then occur as yeast evolves towards faster growth.
Using RPL25 or SEC23 haploinsufficient gene loci to drive amplification
The effect of haploinsufficient genes on growth fitness has been characterised previously28. We used the ribosomal 60S subunit protein L25 (RPL25) and the SEC23-encoding component of the Sec23p-Sec24p heterodimer of the COPII vesicle coat. These two genes have the strongest fitness effect in rich medium and in minimal mineral medium28. We developed four constructs with RPL25 as the driving gene, LEU2 as selection marker, and an early-firing ARS ARS30632 to facilitate amplification; and three constructs with SEC23 as the driving gene, hygromycin B resistant gene hphMX as selection marker, and the strong ARS1max ARS33 to facilitate amplification (Fig. 2a).
To identify promoters with suitable expression strengths, promoters were selected from the wide variety of promoters we previously analysed34, to test with each target locus (Fig. 2a, d). For the RPL25 constructs we used the YEF3 promoter (which has similar strength to the RPL25 promoter; Construct 1) and the ERG1, PDA1, or BTS1 promoters (all with multiple-fold weaker expression than RPL25 promoter; Constructs 2–4). For the SEC23 constructs, we used the ERG1 promoter (stronger than the SEC23 promoter; Construct 5), the GLO2 promoter, or the COG7 promoter (both multiple-fold weaker than the SEC23 promoter; Constructs 6 and 7). An eighth promoter construct was designed and tested later (see below). We used yeast-enhanced green fluorescent protein (yEGFP) under the control of the TEF1 promoter and the URA3 terminator as the gene of interest and as a reporter for proof of concept.
The seven constructs were transformed into S. cerevisiae CEN.PK strains. Transformation plates were screened by imaging yEGFP fluorescence under blue light (Supplementary Fig. 1a, c) and colonies were selected for increased fluorescence. For each construct, six strongly fluorescing clones were selected. Visual observation after sub-culturing demonstrated an inverse correlation between promoter strength (Fig. 2d) and GFP fluorescence (Supplementary Fig. 1b). Three clones with similar fluorescence were selected for quantitative characterisation for each construct.
Where promoter strength was similar or greater than the native promoter, yEGFP was found at a single copy on the genome (Fig. 2c: Constructs 1 and 5), and fluorescence (Fig. 2e: Constructs 1 and 5) was similar to fluorescence we observed previously in strains with a single copy of the PTEF1-yEGFP-TURA3 construct3. yEGFP gene copy number and fluorescence both increased where the native promoter was substituted for weaker promoters (Fig. 2c, e: Constructs 2–4, 6, 7). Copy number increased from 4-fold to 47-fold, whereas fluorescence increase was 4-fold to 92-fold. There was a strong positive correlation between copy number and fluorescence (r2 = 0.985), and a weak negative correlation between fluorescence and promoter strength/copy number (r2 = 0.376 and 0.694 respectively). The most remarkable result was where the RPL25 promoter was substituted for the BTS1 promoter; this resulted in ~47 copies of yEGFP per genome and a ~92-fold increase yEGFP fluorescence (Fig. 2c, e).
To further increase copy number at the SEC23 locus, we attenuated translation by making a construct with three non-preferred glycerine codons (GGA) inserted following the start codon of SEC23 under the control of the COG7 promoter (Fig. 2a: Construct 8), which delivered the most gene amplification in the first round (9 copies). A slight increase in gene copy and fluorescence was obtained (Fig. 2c, e). Translational downregulation by use of non-preferred codons provides a second mechanism to drive an increase in copy number for genes at haploinsufficient gene loci.
In the initial design (Fig. 1), we include ARS in the module basing on the genetic features at naturally amplified genomic loci. To confirm the role of ARS in the current system, we removed the ARS sequence in the Construct 3. The ARS-removed construct could lead to the formation of the very fluorescent colonies after transformation (Supplementary Fig. 1). This indicates that ARS may not be essential for HapAmp.
Increased copy number did not negatively impact the growth rate of any of the strains except for clones with the PBTS1-RPL25 construct (Fig. 2b), which had an exceptionally high integration copy number (Fig. 2c). This strain showed an ~7% decrease in growth rate (two-tailed t-test p = 0.001).
Long-read sequencing on strains containing Constructs 3 and 4 confirmed that the constructs were integrated into the RPL25 (YOL127W) locus and that yEGFP-RPL25 sequences were amplified in tandem repeat structures (Supplementary Figs. 2 and 3–5). The strain expressed the highest level of yEGFP (Construct 4) was sub-cultured in yeast extract-peptone-glucose medium for ~48 generations for stability test (Supplementary Fig. 6). GFP fluorescence levels and population homogeneity did not change, indicating that HapAmp is genetically stable.
Improving heterologous production of the sesquiterpene trans-nerolidol
We examined the performance of the HapAmp method using sesquiterpene (C15; trans-nerolidol) production. We used a background strain with an upregulated mevalonate pathway for production of terpene precursors (o401R)35,36,37,38. In this strain, the GAL80 repressor gene is disrupted allowing diauxic induction of GAL promoters, which are used to control transgenes.
We constructed a reference strain N401-1 harbouring a multi-copy 2μ plasmid pJT9RFR39 (Fig. 3a) with overexpression cassettes for farnesyl pyrophosphate synthase (ERG20) and nerolidol synthase (Ac.NES1). The nerolidol synthase cassette includes a fluorescence-activating and absorption-shifting tag (Y-FAST)40 and a 2A peptide from Equine rhinitis B virus 141 fused to the N-terminus of nerolidol synthase. This allows Y-FAST fluorescence to be used as a proxy for nerolidol synthase expression39.
The nerolidol synthase expression cassette (Y-FAST-2A-Ac.NES1) was cloned into the RPL25 insertion vector in the amplification region with three different promoters for replacement of the RPL25 promoter; the ERG20 expression cassette was cloned at the non-amplification region (Fig. 3b). Colonies with bright Y-FAST fluorescence were selected from the transformation plates. This delivered strains N401-2, N401-3, & N401-4 (promoters PERG1, PPDA1, and PBTS1, respectively).
Compared to the reference strain N401-1, these three strains exhibited faster growth (Fig. 3c, d), higher Y-FAST fluorescence (Fig. 3f), and higher nerolidol production (Fig. 3h). The Y-FAST-2A-Ac.NES1 cassette was successfully amplified in vivo in the three test strains (Fig. 3e).
The reference 2μ plasmid strain harboured 14 copies of the Y-FAST-2A-AcNES1 construct, similar to strain N401-3, and higher than that in strain N401-2. However, N401-1 had the lowest Y-FAST fluorescence (Fig. 3f). The discrepancy between copy number and fluorescence was due to lack of induction of Y-FAST expression in a large proportion of N401-1 cells (Fig. 3g). In contrast to the 2μ plasmid strain, the strains harbouring the in vivo amplification constructs showed better synchronicity for Y-FAST induction (Fig. 3g N401-3; others not shown). This may contribute to the improved production.
Improving heterologous production of the monoterpene limonene
We next tested the system on production of monoterpenes (C10). Monoterpene production requires introduction of a dedicated C10 geranyl pyrophosphate (GPP) synthase42. We have previously used an Erg20pN127W mutant42, which excludes the C15 chain from the active site to generate a GPP pool, in combination with targeted degradation of the endogenous C15 synthase Erg20p via protein degron tags35,39 to decrease competition at the C10 node by Erg20p and redirect GPP towards monoterpene production. In mevalonate pathway-enhanced strains, this approach delivered less than 100 mg l−1 monoterpene—an order of magnitude below the levels achieved for sesquiterpene engineering.
We used a mevalonate pathway-enhanced strain with the endogenous Erg20p under an auxin-inducible protein degradation mechanism39 as a background strain to minimise flux competition through the native sterol pathway. Two different promoter constructs were developed for amplification of the limonene synthetic module (Fig. 4a). The amplified region contained a fusion of multiple genes: Y-FAST-2A39, the maltose-binding protein from E. coli for improved solubility43, a short linker, limonene synthase from Citrus limon35, a 6*glycerine linker, and the Erg20p N127W F96W mutant42 (which has a higher specific GPP production rate than the Erg20pN127W mutant) as a GPP synthase. This fusion construct was under the control of the GAL2 promoter from S. kudriavzevii44. The two constructs were transformed into the RPL25 locus in the background strain, delivering strains LIM141M (PPDA1) and LIM141MH (PBTS1).
For the reference strain, the construct was introduced into the background strain via a 2μ plasmid (Fig. 4a). We characterised four biological replicates (LIM141R representing three biological replicates and LIM141R2 representing one biological replicate; Fig. 4). In this case, 2μ plasmid delivered ~2 copies per genome of the limonene synthase/Y-FAST module (shown by Y-FAST copy number; Fig. 4c). LIM141R, the three biological replicates produced ~40 mg l−1 limonene (Fig. 4f), the titre same to a previous strain LIM141 expressing limonene synthase and Erg20pN127W without gene fusion39. However, one biological replicate (LIM141R2, Fig. 4) produced ~300 mg l−1 limonene. LIM141R2 exhibited faster growth and higher Y-FAST fluorescence levels than other three biological replicates (LIM141R, Fig. 4b, d, e). The improvement in LIM141R2 may be caused by unintended genetic variations.
Harbouring HapAmp limonene synthetic module, both strains LIM141M and LIM141MH produced an order of magnitude more limonene than LIM141R and previous efforts using 2µ plasmids35,39, with the best production, ~0.95 g l−1 limonene at 96 h, by strain LIM141M (Fig. 4f). This titre is 5.6-fold higher than the previous highest titre ever obtained in yeast45, and ~2-fold higher than the best titres achieved in batch cultivation in E. coli46,47. Strain LIM141MH showed a slower exponential growth and the lower levels of Y-FAST fluorescence compared to strain LIM141M (Fig. 4b, d, e), despite having more copies of the limonene synthase/Y-FAST module (shown by Y-FAST copy number; Fig. 4c). Both strains also accumulated ~12 mg l−1 of the monoterpene alcohol geraniol, which is commonly produced by yeast with an increased GPP pool35,39. No farnesol (C15 alcohol) or geranylgeraniol (C20 alcohol) were accumulated by the strains, indicating that subcellular pools of FPP and the C20 geranylgeranyl pyrophosphate (GGPP) were low, and that amplification of limonene synthetic module led to significant redirection of the carbon flux towards monoterpene production.
Improving heterologous tetraterpenoid lycopene production in yeast
A three-gene lycopene synthetic module controlled by GAL promoters was previously constructed in a 2μ plasmid37 (Fig. 5a). This construct includes the farnesyl pyrophophase mutant gene ERG20F96C which produces GGPP48, a phytoene synthase49,50, and a lycopene-forming phytoene desaturase mutant50. This plasmid was transformed into a mevalonate pathway-enhanced background strain, generating strain LYC137. This strain accumulated ~5 mg lycopene per gram of biomass in 120-h flask cultivation (Fig. 5b).
The lycopene synthetic module was sub-cloned into both the PDA1 and BTS1 promoter RPL25-driving HapAmp vectors (Fig. 5a). The resulting constructs were transformed into the same background strain, generating strains LYC4 and LYC5, respectively. Strain LYC4 (PPDA1-RPL25) accumulated slightly more lycopene than strain LYC1, although the increase was not significant (Fig. 5b). Strain LYC5 accumulated ~25 mg lycopene per gram of biomass, five-fold higher than strain LYC1 (Fig. 5b).
High-level expression of heterologous proteins in yeast
S. cerevisiae can be used as a platform organism for protein production, including production of pharmaceutical proteins. However, a notorious disadvantage is that heterologous proteins production is not as high as what is achievable with E. coli expression systems. The high-level expression in E. coli can be attributed to the usage of high-copy-number plasmids (such as the common pET vectors with copy number about ~15–20) and the use of a very strong inducible promoter51. We used the PBTS1-RPL25-driving HapAmp constructs to introduce the AeBlue chromoprotein gene52 (Fig. 6a) or the EforRed chromoprotein gene53. Blue or pink colonies were obtained on the transformation plates (Supplementary Fig. 7), indicating high-level expression of the chromoproteins.
Having confirmed that the chromoproteins were effective markers, we then inserted a human papillomavirus (HPV) 16 major capsid protein L1 gene after the AeBlue expression cassette (Fig. 6a) to test the system for production of a pharmaceutical protein. For a reference, we cloned AeBlue-and-HPV16-L1 expression cassettes into a yeast 2μ plasmid (Fig. 6a). To compare the efficiency of protein production in different systems, an empty 2μ plasmid, the AeBlue-and-HPV16-L1 2μ plasmid, the RPL25-amplifiable AeBlue construct, and the RPL25-amplifiable AeBlue-and-HPV16-L1 construct were transformed individually into CEN.PK (gal80Δ). The four resulting strains were grown in MES-buffered YNB medium with 20 g l−1 glucose aerobically for 72 h. Cells with multi-copy integration of the AeBlue expression cassette showed a strong Tibetan blue colour, while cells with an empty cassette were milky white colour (Fig. 6b). The cells with 2μ plasmid containing AeBlue + HPV-L1 expression cassettes were a faint blue colour, whereas the cells with multi-copy integration of AeBlue + HPV-L1 expression cassettes displayed the strong Tibetan blue colour (Fig. 6b). This indicated superior expression capacity from the in vivo amplification method for multi-copy genome integration, compared to conventional 2μ plasmid method.
SDS-PAGE of whole-cell and soluble protein extracts showed bands at ~25 kD (AeBlue molecular weight) in all samples, with much stronger bands observed in the multi-copy integration strain samples than in the 2μ plasmid strain samples (Fig. 6d). In the multi-copy integration strains, these bands represented ~3% of whole-cell protein, suggesting heterologous protein expression in yeast may reach the levels often obtained in E. coli.
A second strong band at ~50 kD band (HPV16-L1 molecular weight) was observed in samples from cells expressing HPV-L1, although it was not as distinct at the putative AeBlue band (Fig. 6d). This may be due to the use of the Se.GAL2 promoter, which is not fully induced in the ethanol phase, in these constructs compared to the constitutive ALD6 promoter used for the AeBlue expression cassette. Again, the bands in the multi-copy integration strain samples were stronger than the 2μ plasmid samples. Surprisingly, considering that HPV16-L1 is a soluble protein54, these bands were not distinguishable in lysate supernatant samples.
To fully induce the Se.GAL2 promoter for HPV16-L1 expression, we attempted to grow the plasmid and integration strains harbouring HPV16-L1 in synthetic minimal medium (YNB) with ethanol or galactose as the carbon source. However, these cultivation conditions were lethal for the multi-copy-integration cells. We then grew the cells in rich (yeast-peptone (YP)) medium with 20 g l−1 galactose as the carbon source. Under these conditions, AeBlue expression from 2μ plasmid was not observable by visual examination (Fig. 6b) or SDS-PAGE (Fig. 6d). This may be due to loss of 2μ plasmid in the rich medium. In contrast, strong AeBlue-specific and HPV16-L1-specific bands were seen in whole-cell lysate and lysate supernatant samples from the cells with multi-copy integration constructs. This further confirmed that HPV16 L1 capsid protein is insoluble in yeast in our system. Attempts to solubilise HPV16-L1 L1 capsid protein were unsuccessful (data not shown). Despite being unable to detect HPV16-L1-specific bands in lysate supernatant (Fig. 6d), we could still separate properly assembled virus-like particles (VLPs) by ultracentrifugation of lysate supernatant (Fig. 6c). SDS-PAGE examination of VLP components purified from ultracentrifugation showed a HPV16-L1-specific band at ~50 kD (Fig. 6d; Lane VLPs:4). TEM images of the VLPs showed their diameter was around 40 nm (Fig. 6c), consistent with previous literature55.
In SDS-PAGE results, we observed strong bands in the lysate supernatant sample (band d1) and lysate pellet samples (bands d2, d3, and d4) (Fig. 6d). LC-MS/MS-based proteomics was used to analyse the protein composition in these four bands (Supplementary Data 1–4). The top hit protein in the ~50 kD band (band d2) was the HPV16 L1 capsid. Interestingly, the top hit proteins in other three bands (d1, d3 and d4) were yeast chaperones. In bands d1 and d3, the top hit proteins were HSP70 family chaperone Ssa1, and in bands d4, the top hit protein was HSP90 family chaperone Hsc82. We therefore hypothesised that insoluble expression of HPV16-L1 caused upregulation of yeast chaperones, and HPV16-L1, HSP70 chaperones, and HSP90 chaperones might exist in insoluble forms. However, it would require further systematic examination to get a better understanding of these phenomena.
In summary, although some insoluble expression of the HPV16 L1 was observed, our results both with chromoprotein AeBlue and the HPV16 L1 showed that multi-copy gene integration via HapAmp method can lead to heterologous protein overexpression in yeast to the high levels that are commonly seen in E. coli expression systems.
Here, we developed a genetic engineering method to integrate multiple copies of heterologous gene(s) into the yeast genome using in vivo gene amplification driven by a haploinsufficient gene (HapAmp). The functional strength per copy of a haploinsufficient gene is strongly associated with growth fitness, which can be exploited as an evolutionary force to drive gene amplification. Decreased expression level provides an evolutionary force that drives amplification of linked haploinsufficient and heterologous genes, so that cells are growth competitive. We exemplified the application of this method to improve production of different types of terpene products. We also showed that our method enabled high-level expression of heterologous protein in yeast, at levels similar to that achieved in E. coli for protein production.
This method presents three main advantages for the introduction of heterologous genes via genome integration. Firstly, integration copy number can be titrated by altering expression dosage per copy of haploinsufficient gene. Expression level can be reduced by a variety of methods. Here, we tested two approaches: (1) replacing the gene promoter with a weaker promoter (Figs. 2–4), and (2) using non-preferred codons (Fig. 2). In these experiments, we observed a range of between 4 and 47 copies, with an inverse relationship between promoter strength and copy number. We characterised a range of weak promoters here (Supplementary Fig. 8) and in previous work3 that can be applied to decrease gene dosage. In addition to promoter strength and codon usage, other approaches could be used to decrease expression dosage, including engineering the kozak sequence and/or the 5′-mRNA structure. These genetic tools add engineering flexibility to modify copy number for this HapAmp method in yeast.
Secondly, the maintenance of integration is auto-selectable: selection pressure is provided from the dosage sensitivity of the haploinsufficient gene, which is linked to the gene of interest and is maintained to support normal growth rates. This means that no antibiotics or modification of other environmental conditions in the culture are required to provide ongoing selection pressure for maintenance of the gene of interest. Compared to use of a 2µ plasmid, this method provides more stable expression of heterologous proteins in yeast (Fig. 6b). In addition, it does not require chemical induction for amplification2,15.
Thirdly, the presence of multiple haploinsufficient genes means that many different loci are available for engineering gene amplification. We demonstrated the method using RPL25 and SEC23 as the driving gene. We further characterised the promoter strength of fifteen additional haploinsufficient genes (Supplementary Fig. 8) that can also be used to drive gene amplification.
Initial integration of the genes of interest uses standard yeast transformation procedures by selection of an auxotrophic or antibiotic marker (e.g. LEU2 or hphMax in Figs. 2–6). Upon transformation, we observed a variable proportion of false clones (not expressing the gene of interest) on the transformation plates (Supplementary Figs. 1 and 5). We presume that, in these cases, either spontaneous mutations have provided the yeast an alternative mechanism to recover growth rate, or the gene of interest was not correctly integrated into the target locus. Use of visual markers (fluorescent proteins or chromoproteins) can facilitate the selection of correct clones with amplified constructs. In the absence of such visual markers, characterisation or verification of a pool of clones would be necessary to select clones with multi-copy integration of heterologous genes. Further optimising the genetic background of the yeast strains used, such as eliminating the non-homologous end joining mechanism to decrease non-homologous gene integration, might be useful to eliminate rate of false positives for the current method.
The HapAmp method successfully improved production of heterologous terpenes including the C15 sesquiterpene nerolidol (Fig. 3), the C10 monoterpene limonene (Fig. 4), and the C40 tetraterpene lycopene (Fig. 5). Production of C15 terpenes in yeast is typically relatively straightforward, with gram per litre titres achievable39,56. This is likely because the C15 precursor, FPP, is produced in yeast naturally to deliver sterol pathway products required for yeast growth. In addition, sesquiterpene synthases have reasonably good catalytic properties, making them more competitive to access FPP. Production of C10 monoterpenes, however, has historically been very challenging. This is due to both a dearth of C10 precursors57 and the poor catalytic properties of many monoterpene synthases45,58. These limitations have previously restricted published titres of monoterpenes to mg l−1 in flask cultivation35,39,45,59. Here, we have achieved g l−1 titres (Fig. 4) in a single engineering step using a high mevalonate pathway flux strain with an introduced GPPS and targeted degradation of FPPS to decrease competition at the C10 pathway node. We believe this is the highest titre achieved in metabolically engineered microbes in a flask cultivation with 20 g l−1 glucose as carbon source reported to date.
Interestingly, one replicate of the monoterpene control strain produced ~300 mg l−1 limonene, in comparison to the other three replicates which produced ~40 mg l−1 limonene, despite the plasmid copy number being the same in all four replicates (Fig. 4). This suggests that an unintended mutation has arisen in this strain which affects limonene production positively. The source of this variation will be examined in future work and may form the basis of further engineering efforts.
We observed a tight correlation between gene copy number and GFP fluorescence (Fig. 2); however, this relationship breaks down for the different terpene products, resulting in variable improvement ratios. This is most likely due to the fact that the relationship between the GFP peptide and its fluorescence is very close and does not rely on other factors such as substate and cofactor availability—whereas the terpene synthases are enzymes and subject to more influences on their behaviour. In addition, variable metabolic burden caused by overexpression of terpene synthases or other physiological perturbations in metabolically engineered systems may affect the relationship between copy number and product titre. For products, limonene production improvement was ~24-fold, whereas nerolidol improvement was 1.7-fold, and lycopene improvement was 5-fold. However, we always obtained a higher titre by in vivo gene amplification. In particular, for monoterpenes, insufficient catalytic efficiency of terpene synthase is a significant bottleneck for production of heterologous terpenoids in yeast. Increasing copy number via insertion of tandem repeats at the same locus combined with screening for improved production56 or introduction of additional expression cassettes at separate loci43 has been used to overcome this bottleneck previously. However, these approaches require complex cloning and extended experimental timelines to deliver the desired improvements. The HapAmp system provides a faster and simpler method to achieve superior results.
We tested several constructs ranging up to three expression cassettes (lycopene pathway: insert size of 7917 bp). We have not sought to test the maximum cargo size for this approach. However, we did not observe a clear relationship between size of the insert (‘cargo’) and copy number amplification, suggesting that even larger inserts may be possible for the technique.
In addition to its application in metabolic engineering, we also examined the potential of HapAmp for increasing heterologous protein production. Using chromoprotein AeBlue and the HPV16 L1 capsid protein as examples (Fig. 6), we demonstrated that in S. cerevisiae, heterologous protein could be produced at levels commonly seen in E. coli. AeBlue was expressed in soluble form, whereas HPV16 L1 capsid protein was primarily expressed in insoluble form. Insoluble expression of HPV16 L1 capsid protein has been reported in E. coli60,61,62 but not in S. cerevisiae. In E. coli, N-terminal truncation60,61, use of a fusion partner61, and overexpression GroEL/GroES chaperones62 (which accept broader substrates than cytosolic chaperones in S. cerevisiae63,64), improved soluble expression of HPV L1 capsid proteins. These strategies might also improve soluble expression of HPV capsid proteins in yeast.
The HapAmp method should be applicable in other industrially relevant chassis organisms that have haploinsufficient genes. A potential haploinsufficient gene may encode essential components of the machineries for protein synthesis and transportation or other essential cell structures28. Putative haploinsufficient genes can be identified by comparative genomics and confirmed by testing growth fitness in association with expression dosage of a gene. For diploid organisms, this can be done by disrupting one allele and integrating the amplifiable construct at the other allele locus, or by simultaneously integrating the amplifiable constructs at both alleles. In addition, native non-homologous end joining mechanisms can be diminished/disrupted to improve the successful rate of amplification of genes of interests65. A nuclease-mediated DNA double-chain break like CRISPR66 could also be used to assist the integration of the amplifiable construct. This may avoid the use of a selectable marker in the gene amplification construct.
Plasmid and strain construction
Plasmids used in this work are listed in Supplementary Data 5, and strains are listed in Supplementary Data 6. Primers used in polymerase chain reaction (PCR) and PCR performed in this work are listed in Supplementary Data 7. Plasmid construction processes are listed in Supplementary Data 8. Yeast strain construction processes are listed in Supplementary Data 9. A LiAc/SS carrier DNA/PEG method67 was used for yeast transformation.
For characterisation of yEGFP-expressing strains, yeast cells from glycerol stocks were streaked on YNB-glucose agar, which comprised of 6.9 g l−1 yeast nitrogen base without amino acids (YNB, FORMEDIUM#CYN0402) with pH adjusted to 6.0 using sodium hydroxide solution, 20 g l−1 glucose, and 20 g l−1 agar. MES-buffered YNB-glucose medium was used in following cultivation, which comprised of 19.5 g l−1 2-(N-morpholino)ethanesulfonic acid (MES), 6.9 g l−1 YNB, 20 g l−1 glucose, and its pH was adjusted to 6.0 with ammonia hydroxide solution. For the growth in flask, seed cultures grown to the exponential phase (OD600 ≤ 4) were inoculated into 20 ml MES-buffered YNB-glucose medium in 125 ml Erlenmeyer flasks to start the cultivation in a 200 rpm 30 °C incubator. For the growth in 96-well microplate, yeast cells were grown in YNB-glucose medium (6.9 g l−1 YNB, 20 g l−1 glucose, pH 6.0) for about 20 h to stationary phase in a 350 rpm 30 °C incubator to prepare seed culture. Seed culture (5 μl) was inoculated into 100 μl MES-buffered YNB-glucose medium to prepare Culture 1. Culture 1 (2 μl) was inoculated into 100 μl MES-buffered YNB-glucose medium to prepare Culture 2. Culture 2 was incubated in a 350 rpm 30 °C incubator overnight for analysis of yEGFP fluorescent in the cells grown to the exponential growth phase, and Culture 1 for two nights for analysis in the cells grown to the ethanol growth phase.
For characterisation of nerolidol/limonene-producing strains, dodecane-overlayed two-phase flask cultivation was used. Yeast cells from glycerol stocks were streaked on YNB-high-glucose agar, which contained 6.9 g l−1 YNB (pH 6.0), 200 g l−1 glucose, and 20 g l−1 agar. Before initiating the two-phase flask cultivation, cells were pre-cultured in MES-buffered YNB-20 g l−1 glucose to exponential phase (OD600 between 1 to 4) and collected by centrifugation. Collected cells were then resuspended in fresh fermentation medium. To initiate the cultivation, appropriate volumes of pre-cultured cells were transferred to MES-buffered YNB medium with 20 g l−1 glucose to an initial OD600 of 0.2 in a total volume of 23 ml medium in a 250 ml flask, and 2 ml sterile dodecane was added after inoculation. In the first 12 h of cultivation, 3 ml culture was sampled for growth curve measurement. Dodecane was sampled and stored at −80 °C for terpene analysis.
Flask cultivations for lycopene-producing strains were prepared as the flask cultivation used for yEGFP-expressing strains. For chromoprotein/HPV16 L1-expressing strains, yeast cells grown overnight in 5 ml MES-buffered YNB-glucose medium were inoculated into 20 ml fresh MES-buffered YNB-glucose medium or 20 ml YP-galactose (20 g l−1 peptone, 10 g l−1 yeast extract, and 20 g l−1 galactose) to start characterisation cultures.
A BD Accuri™ C6 flow cytometer (BD Biosciences, USA) was used for fluorescence analysis in single cells. Cells expressing yEGFP were sampled and directly used for characterisation of the yEGFP fluorescence. Cells expressing Y-FAST was sampled and mixed with 20 μM HMBR (synthesised and prepared in 2 mM stock in dimethyl sulfoxide40) before analysis. Debris particles were excluded through an FSC.H threshold with the threshold value of 250,000. A 488 nm laser was used to excite GFP and Y-FAST fluorescence. The detector equipped with a 530/20 bandpass filter was used to monitor the fluorescence (FL1.A). For each sample, 10,000 events were recorded. A BD Csampler software (BD Accuri C6 software version 1.0.264.21) were used to extract mean values of FSC.A, SSC.A, and FL1.A. The fluorescence level of GFP and Y-FAST was expressed as the fold of a background fluorescence in the exponential grown phase cells of strain GH43.
HPLC analysis was performed by the Metabolomics Australia (Queensland node) using a previously described method68. In brief, an Agilent 1200 HPLC system and a Thermo Fisher Chromeleon Chromatography Data System software were used. Dodecane samples in some cases were diluted with dodecane before HPLC analysis. For HPLC analysis, 5 μl dodecane samples (or standards prepared in dodecane) were mixed with 200 μl ethanol, and 20 μl mixture was injected and separated with a guard column (SecurityGuard Gemini C18, Phenomenex PN: AJO-7597) and a Zorbax Extend C18 column (4.6 × 150 mm, 3.5 µm, Agilent PN: 763953-902). The mixture of solvent A (water) and solvent B (45% acetonitrile, 45% methanol, and 10% water) was used to elute the analytes with a linear gradient (from 0–24 min, 5–100% solvent B; from 24–30 min, 100% solvent B; from 30.1–35 min, 5% solvent B).
For lycopene measurement, yeast cells were collected and resuspended in 200 μl 2 M l−1 sodium hydroxide and vortexed with 200 mg glass bead and 1 ml hexane for at least 10 min. Lycopene molar extinction coefficient (182 × 103) at 471 nm was used to calculate lycopene concentration69. In some cases, lycopene extracts were diluted with hexane to make the absorbance reading <0.6.
Yeast cells were homogenised by vortexing with glass beads for 15 min in phosphate-buffered saline (PBS) buffer plus 2 mM ethylenediaminetetraacetic acid. Whole-cell lysates, lysate supernatants, and lysate pellets were examined by sodium dodecyl sulphate-polyacrylamide gel electrophoresis analysis on Mini-PROTEAN® Precast Gels (Bio-rad).
The lysis was followed by centrifugation at 18,000 × g for 30 min to pellet the cellular debris. The soluble fraction was then loaded on top of a gradient made of 1 ml of 20% Iodixanol/PBS buffer, 1 ml of 30% Iodixanol/PBS and 1 ml of 40% Iodixanol/PBS in a Thinwall Ultra-Clear Tube (Beckman Coulter, Indianapolis, USA) and subjected to ultracentrifugation for 2 h 30 min at 150,000 × g on a SW41 Ti rotor or a using a Beckman Optima L-100XP ultracentrifuge (Beckman Coulter, Indianapolis, USA). A band containing the VLPs encapsulating protein was extracted using a 1 ml syringe by poking a whole through the tube. Bradford was used to measure protein concentration and sample was further examined on TEM and purity confirmed on Mini-PROTEAN® Precast Gels (Bio-rad).
Transmission electron microscopy
Samples containing purified VLPs of 0.1 mg ml−1 were applied to formvar/carbon coated grids (ProSciTech Pty Ltd, Australia) and incubated for 2 min. Grids were then washed with 40 μl of distilled water for 30 s twice, and then stained with 20 g l−1 uranyl acetate for 1 min, after being blotted on filter paper. Images were taken on a HITACHI HT7700 transmission electron microscope at accelerating voltage of 80 keV at the Centre for Microscopy and Microanalysis.
Yeast genomic DNA was extracted using MagAttract HMW DNA Kit (Qiangen) with a modified protocol. Yeast cells (20 ml, OD600 around 10) were washed once using PBS buffer and resuspend in 2 ml 1 M sorbitol solution. Yeast cell walls were digested by adding 30 U Zymolyase-20T (nacalai, Japan; 1 U per μl in 1* PBS containing 100 mM DTT and 50% v/v glycerol) at 30 °C for 30 min. Yeast protoplast cells were collected and resuspended in 300 μl Buffer AL (MagAttract HMW DNA Kit) by pipetting using wide bore pipette tips, and then 360 buffer ATL (MagAttract HMW DNA Kit) was added and mixed. Following this, protocol provided in MagAttract HMW DNA Kit (Qiangen) was adopted including digestion by Proteinase K and Rnase A and purification using magnetic beads. Genomic DNA was eluted using 400 μl Buffer AE (MagAttract HMW DNA Kit) and treated using 100 μl tris-saturated phenol (pH 8.0, Ameresco) by flickering and 100 μl chloroform was added and mixed. Upper-layer water phase was collected after centrifuging at 17,000 × g for 5 min and mixed with 1 ml ethanol. Magnetic beads (MagAttract HMW DNA Kit) were used to purify genomic DNA with twice 70% ethanol wash and elution in 50 μl water. Concentration of genomic DNA was quantified using Qubit Fluorometer and Qubit dsDNA BR Assay Kit (Thermo Fisher). Genomic DNA (500 ng) was used to prepare genome sequencing library using Rapid Barcoding Kit (SQK-RBK004, Oxford Nanopore) and sequenced using R9 flowcell MIN106D and MinION Mk1C (Oxford Nanopore). High-accurate base-calling was performed using ont-guppy-for-mk1c (version 4.2.3) installed MinION Mk1C (MinKNOW version 20.10.6). Galaxy Australia online server was used for data processing70. Collapse Collection (Galaxy Version 5.1.0) was used to combine fastq dataset into a single file. Nanoplot was used for statistical analysis of MinION reads71. Canu assembler was used for genome sequence assembly72. Maker (Galaxy Version 2.31.11) was used to collect annotation evidence with input of S. cerevisiae gene sequences and heterologous gene sequences as ESTs input file73. miniMap2 was used to align trimmed reads outputted by Canu assembler against contigs outputted by Canu assembler74. JBrowse (version 1.16.10-desktop)75 and Integrative Genomics Viewer (version 2.8.13)76 were used to illustrate genome structure and read alignment.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
MinION whole genome sequencing raw-read data are achieved in NCBI BioProject database with submission ID PRJNA688119. Processed data for MinION genome sequencing are achieved in Zenodo (https://zenodo.org/record/6378077#.YnPhi9rMI2w; https://doi.org/10.5281/zenodo.6378077). Plasmids used in this study are available on request or on Addgene (Addgene IDs: 185870-185894) (https://www.addgene.org/Claudia_Vickers/). Source data are provided with this paper.
Vickers, C., Blank, L. & Kromer, J. Grand challenge commentary: chassis cells for industrial biochemical production. Nat. Chem. Biol. 6, 875–877 (2010).
Tyo, K. E., Ajikumar, P. K. & Stephanopoulos, G. Stabilized gene duplication enables long-term selection-free heterologous pathway expression. Nat. Biotechnol. 27, 760–765 (2009).
Peng, B., Williams, T., Henry, M., Nielsen, L. & Vickers, C. Controlling heterologous gene expression in yeast cell factories on different carbon substrates and across the diauxic shift: a comparison of yeast promoter activities. Microb. Cell Factories 14, 91 (2015).
Stephanopoulos, G. & Vallino, J. J. Network rigidity and metabolic engineering in metabolite overproduction. Science 252, 1675–1681 (1991).
Gnugge, R. & Rudolf, F. Saccharomyces cerevisiae Shuttle vectors. Yeast 34, 205–221 (2017).
Karim, A. S., Curran, K. A. & Alper, H. S. Characterization of plasmid burden and copy number in Saccharomyces cerevisiae for optimization of metabolic engineering applications. FEMS Yeast Res. 13, 107–116 (2013).
Eguchi, Y. et al. Estimating the protein burden limit of yeast cells by measuring the expression limits of glycolytic proteins. eLife 7, e34595 (2018).
Lopes, T. S., Hakkaart, G. J. A. J., Koerts, B. L., Raue, H. A. & Planta, R. J. Mechanism of high-copy-number integration of Pmiry-type vectors into the ribosomal DNA of Saccharomyces-Cerevisiae. Gene 105, 83–90 (1991).
Loison, G., Nguyen-Juilleret, M., Alouani, S. & Marquet, M. Plasmid–transformed ura3 fur1 double-mutants of S. cerevisiae: an autoselection system applicable to the production of foreign proteins. Bio/Technology 4, 433–437 (1986).
Yoo, J. I., Seppälä, S. & OʼMalley, M. A. Engineered fluoride sensitivity enables biocontainment and selection of genetically-modified yeasts. Nat. Commun. 11, 5459 (2020).
Song, X. et al. POT1-mediated delta-integration strategy for high-copy, stable expression of heterologous proteins in Saccharomyces cerevisiae. FEMS Yeast Res. 17, fox064 (2017).
Kawasaki, G. H. & Bell, L. Google Patents (1999).
Maury, J. et al. EasyCloneMulti: a set of vectors for simultaneous and multiple genomic integrations in Saccharomyces cerevisiae. PLoS ONE 11, e0150394 (2016).
Yamada, R. et al. Cocktail delta-integration: a novel method to construct cellulolytic enzyme expression ratio-optimized yeast strains. Microb. Cell Factories 9, 32 (2010).
Lian, J. Z., Jin, R. & Zhao, H. M. Construction of plasmids with tunable copy numbers in Saccharomyces cerevisiae and their applications in pathway optimization and multiplex genome integration. Biotechnol. Bioeng. 113, 2462–2473 (2016).
Shi, S. B., Liang, Y. Y., Zhang, M. Z. M., Ang, E. L. & Zhao, H. M. A highly efficient single-step, markerless strategy for multi-copy chromosomal integration of large biochemical pathways in Saccharomyces cerevisiae. Metab. Eng. 33, 19–27 (2016).
Parekh, R. N., Shaw, M. R. & Wittrup, K. D. An integrating vector for tunable, high copy, stable integration into the dispersed Ty δ sites of Saccharomyces cerevisiae. Biotechnol. Prog. 12, 16–21 (1996).
Tumen-Velasquez, M. et al. Accelerating pathway evolution by increasing the gene dosage of chromosomal segments. Proc. Natl Acad. Sci. USA 115, 7105–7110 (2018).
Zhou, H., Cheng, J. S., Wang, B. L., Fink, G. R. & Stephanopoulos, G. Xylose isomerase overexpression along with engineering of the pentose phosphate pathway and evolutionary engineering enable rapid xylose utilization and ethanol production by Saccharomyces cerevisiae. Metab. Eng. 14, 611–622 (2012).
Demeke, M. M., Foulquie-Moreno, M. R., Dumortier, F. & Thevelein, J. M. Rapid evolution of recombinant Saccharomyces cerevisiae for xylose fermentation through formation of extrachromosomal circular DNA. PLoS Genet. 11, e1005010 (2015).
Gibbons, J. G., Branco, A. T., Godinho, S. A., Yu, S. K. & Lemos, B. Concerted copy number variation balances ribosomal DNA dosage in human and mouse genomes. Proc. Natl Acad. Sci. USA 112, 2485–2490 (2015).
Fischer, U. et al. Gene amplification during differentiation of mammalian neural stem cells in vitro and in vivo. Oncotarget 6, 7023–7039 (2015).
Schimke, R. T. Gene amplification in cultured animal-cells. Cell 37, 705–713 (1984).
Oh, E. J. et al. Gene amplification on demand accelerates cellobiose utilization in engineered Saccharomyces cerevisiae. Appl. Environ. Microbiol. 82, 3631–3639 (2016).
Fogel, S. & Welch, J. W. Tandem gene amplification mediates copper resistance in yeast. Proc. Natl Acad. Sci. Biol. 79, 5342–5346 (1982).
Jack, C. V. et al. Regulation of ribosomal DNA amplification by the TOR pathway. Proc. Natl Acad. Sci. USA 112, 9674–9679 (2015).
Morrill, S. A. & Amon, A. Why haploinsufficiency persists. Proc. Natl Acad. Sci. USA 116, 11866–11871 (2019).
Deutschbauer, A. M. et al. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169, 1915–1925 (2005).
Ganley, A. R. D., Ide, S., Saka, K. & Kobayashi, T. The effect of replication initiation on gene amplification in the rDNA and its relationship to aging. Mol. Cell 35, 683–693 (2009).
Conti, C., Herrick, J. & Bensimon, A. Unscheduled DNA replication origin activation at inserted HPV 18 sequences in a HPV 18/MYC amplicon. Gene Chromosom. Cancer 46, 724–734 (2007).
Lu, L., Zhang, H. & Tower, J. Functionally distinct, sequence-specific replicator and origin elements are required for Drosophila chorion gene amplification. Genes Dev. 15, 134–146 (2001).
Newlon, C. S. et al. Analysis of replication origin function on chromosome III of Saccharomyces cerevisiae. Cold Spring Harb. Symp. Quant. Biol. 58, 415–423 (1993).
Liachko, I., Youngblood, R. A., Keich, U. & Dunham, M. J. High-resolution mapping, characterization, and optimization of autonomously replicating sequences in yeast. Genome Res. 23, 698–704 (2013).
Peng, B. et al. Engineering eukaryote-like regulatory circuits to expand artificial control mechanisms for metabolic engineering in Saccharomyces cerevisiae. Commun. Biol. 5, 135 (2022).
Peng, B., Nielsen, L. K., Kampranis, S. C. & Vickers, C. E. Engineered protein degradation of farnesyl pyrophosphate synthase is an effective regulatory mechanism to increase monoterpene production in Saccharomyces cerevisiae. Metab. Eng. 47, 83–93 (2018).
Peng, B., Plan, M. R., Carpenter, A., Nielsen, L. K. & Vickers, C. E. Coupling gene regulatory patterns to bioprocess conditions to optimize synthetic metabolic modules for improved sesquiterpene production in yeast. Biotechnol. Biofuels 10, 43 (2017).
Peng, B., Wood, R. J., Nielsen, L. K. & Vickers, C. E. An expanded heterologous GAL promoter collection for diauxie-inducible expression in Saccharomyces cerevisiae. ACS Synth. Biol. 7, 748–751 (2018).
Hayat, I. F. et al. Auxin-mediated induction of GAL promoters by conditional degradation of Mig1p improves sesquiterpene production in Saccharomyces cerevisiae with engineered acetyl-CoA synthesis. Micro. Biotechnol. 14, 2627–2642 (2021).
Lu, Z., Peng, B., Ebert, B. E., Dumsday, G. & Vickers, C. E. Auxin-mediated protein depletion for metabolic engineering in terpene-producing yeast. Nat. Commun. 12, 1051 (2021).
Plamont, M. A. et al. Small fluorescence-activating and absorption-shifting tag for tunable protein imaging in vivo. Proc. Natl Acad. Sci. USA 113, 497–502 (2016).
Souza-Moreira, T. M. et al. Screening of 2A peptides for polycistronic gene expression in yeast. FEMS Yeast Res. 18, foy036 (2018).
Ignea, C., Pontini, M., Maffei, M. E., Makris, A. M. & Kampranis, S. C. Engineering monoterpene production in yeast using a synthetic dominant negative geranyl diphosphate synthase. ACS Synth. Biol. 3, 298–306 (2013).
Wong, J. et al. High-titer production of lathyrane diterpenoids from sugar by engineered Saccharomyces cerevisiae. Metab. Eng. 45, 142–148 (2018).
Peng, B., Nielsen, L. K. & Vickers, C. E. An expanded heterologous GAL promoter collection for diauxie-inducible over-expression in Saccharomyces cerevisiae ACS Synth. Biol. 7, 748–751 (2017).
Ignea, C. et al. Orthogonal monoterpenoid biosynthesis in yeast constructed on an isomeric substrate. Nat. Commun. 10, 3799 (2019).
Alonso-Gutierrez, J. et al. Metabolic engineering of Escherichia coli for limonene and perillyl alcohol production. Metab. Eng. 19, 33–41 (2013).
Rolf, J., Julsing, M. K., Rosenthal, K. & Lutz, S. A Gram-scale limonene production process with engineered Escherichia coli. Molecules 25, 1881 (2020).
Ignea, C. et al. Efficient diterpene production in yeast by engineering Erg20p into a geranylgeranyl diphosphate synthase. Metab. Eng. 27, 65–75 (2015).
Xie, W., Lv, X., Ye, L., Zhou, P. & Yu, H. Construction of lycopene-overproducing Saccharomyces cerevisiae by combining directed evolution and metabolic engineering. Metab. Eng. 30, 69–78 (2015).
Verwaal, R. et al. High-level production of beta-carotene in Saccharomyces cerevisiae by successive transformation with carotenogenic genes from Xanthophyllomyces dendrorhous. Appl. Environ. Microbiol. 73, 4342–4350 (2007).
Kuliopulos, A. & Walsh, C. T. Production, purification, and cleavage of tandem repeats of recombinant peptides. J. Am. Chem. Soc. 116, 4599–4607 (1994).
Shkrob, M. A. et al. Far-red fluorescent proteins evolved from a blue chromoprotein from Actinia equina. Biochem. J. 392, 649–654 (2005).
Alieva, N. O. et al. Diversity and evolution of coral fluorescent proteins. PLoS ONE 3, e2680 (2008).
Carter, J. J., Yaegashi, N., Jenison, S. A. & Galloway, D. A. Expression of human papillomavirus proteins in yeast Saccharomyces cerevisiae. Virology 182, 513–521 (1991).
Mach, H. et al. Disassembly and reassembly of yeast-derived recombinant human papillomavirus virus-like particles (HPV VLPs). J. Pharm. Sci. 95, 2195–2206 (2006).
Meadows, A. L. et al. Rewriting yeast central carbon metabolism for industrial isoprenoid production. Nature 537, 694–697 (2016).
Vickers, C. E., Williams, T. C., Peng, B. & Cherry, J. Recent advances in synthetic biology for engineering isoprenoid production in yeast. Curr. Opin. Chem. Biol. 40, 47–56 (2017).
Peng, B. Modulating gene expression in yeast to optimize metabolic networks for sesquiterpene and monoterpene production. (2018).
Zhao, J. et al. Dynamic control of ERG20 expression combined with minimized endogenous downstream metabolism contributes to the improvement of geraniol production in Saccharomyces cerevisiae. Microb. Cell Factories 16, 17 (2017).
Wei, M. X. et al. N-terminal truncations on L1 proteins of human papillomaviruses promote their soluble expression in Escherichia coli and self-assembly in vitro. Emerg. Microbes Infect. 7, 160 (2018).
Chen, X. J. S., Casini, G., Harrison, S. C. & Garcea, R. L. Papillomavirus capsid protein expression in Escherichia coli: purification and assembly of HPV11 and HPV16 L1. J. Mol. Biol. 307, 173–182 (2001).
Pan, D., Zha, X., Yu, X. H. & Wu, Y. Q. Enhanced expression of soluble human papillomavirus L1 through coexpression of molecular chaperonin in Escherichia coli. Protein Expr. Purif. 120, 92–98 (2016).
Xia, P. F. et al. GroE chaperonins assisted functional expression of bacterial enzymes in Saccharomyces cerevisiae. Biotechnol. Bioeng. 113, 2149–2155 (2016).
Saibil, H. Chaperone machines for protein folding, unfolding and disaggregation. Nat. Rev. Mol. Cell Biol. 14, 630–642 (2013).
Hefferin, M. L. & Tomkinson, A. E. Mechanism of DNA double-strand break repair by non-homologous end joining. DNA Repair 4, 639–648 (2005).
DiCarlo, J. E., Chavez, A., Dietz, S. L., Esvelt, K. M. & Church, G. M. Safeguarding CRISPR-Cas9 gene drives in yeast. Nat. Biotechnol. 33, 1250–1255 (2015).
Gietz, R. D. & Schiestl, R. H. Large-scale high-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nat. Protoc. 2, 38–41 (2007).
Peng, B. et al. A squalene synthase protein degradation method for improved sesquiterpene production in Saccharomyces cerevisiae. Metab. Eng. 39, 209–219 (2017).
Takehara, M. et al. Characterization and Thermal Isomerization of (all-E)-Lycopene. J. Agric. Food Chem. 62, 264–269 (2014).
Afgan, E. et al. Genomics virtual laboratory: a practical bioinformatics workbench for the cloud. PLoS ONE 10, e0140829 (2015).
De Coster, W., D’Hert, S., Schultz, D. T., Cruts, M. & Van Broeckhoven, C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34, 2666–2669 (2018).
Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).
Campbell, M. S., Holt, C., Moore, B. & Yandell, M. Genome annotation and curation using MAKER and MAKER-P. Curr. Protoc. Bioinforma. 48, 4.11.11–39 (2014).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Buels, R. et al. JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol. 17, 66 (2016).
Thorvaldsdottir, H., Robinson, J. T. & Mesirov, J. P. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform 14, 178–192 (2013).
B.P. and this research were supported by a CSIRO synthetic biology future science fellowship and the University of Queensland. B.P. and C.E.V. acknowledge current support from Australian Research Council Centre of Excellence in Synthetic Biology and Queensland University of Technology. Metabolite analysis was performed by Dr Manual Plan in Metabolomics Australia (Bioplatform Australia) Queensland Node. HPLC-MS/MS analysis was performed by Dr Lian Liu in Bioplatform Australia Queensland Node. Yeast strains in this study derives from CEN.PK background strains, which were provided by EUROSCARF (Scientific Research and Development GmbH, Germany) under a non-commercial licence. The authors also acknowledge the facilities, and the scientific and technical assistance, of the Australian Microscopy and Microanalysis Research Facility at the Centre for Microscopy and Microanalysis, The University of Queensland.
The University of Queensland has filed two Australian provisional patents on the methods for gene amplification to claim the intellectual property (Inventors: B.P. and C.E.V. Australian Patent Application numbers: 2022900699 and 2022901094). C.E.V. has a financial interest in Provectus Algae. Other authors declare no competing interests.
Peer review information
Nature Communications thanks Antonios Makris and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Peng, B., Esquirol, L., Lu, Z. et al. An in vivo gene amplification system for high level expression in Saccharomyces cerevisiae. Nat Commun 13, 2895 (2022). https://doi.org/10.1038/s41467-022-30529-8