Metagenomic mining of regulatory elements enables programmable species-selective gene expression

Johns, Nathan I; Gomes, Antonio L C; Yim, Sung Sun; Yang, Anthony; Blazejewski, Tomasz; Smillie, Christopher S; Smith, Mark B; Alm, Eric J; Kosuri, Sriram; Wang, Harris H

doi:10.1038/nmeth.4633

Resource
Published: 19 March 2018

Nature Methods volume 15, pages 323–329 (2018)

Abstract

Robust and predictably performing synthetic circuits rely on the use of well-characterized regulatory parts across different genetic backgrounds and environmental contexts. Here we report the large-scale metagenomic mining of thousands of natural 5′ regulatory sequences from diverse bacteria, and their multiplexed gene expression characterization in industrially relevant microbes. We identified sequences with broad and host-specific expression properties that are robust in various growth conditions. We also observed substantial differences between species in terms of their capacity to utilize exogenous regulatory sequences. Finally, we demonstrate programmable species-selective gene expression that produces distinct and diverse output patterns in different microbes. Together, these findings provide a rich resource of characterized natural regulatory sequences and a framework that can be used to engineer synthetic gene circuits with unique and tunable cross-species functionality and properties, and also suggest the prospect of ultimately engineering complex behaviors at the community level.

**Figure 1: High-throughput characterization of regulatory sequences from 184 prokaryotic genomes.**

**Figure 2: Transcriptional activity of the regulatory library across three diverse species.**

**Figure 3: Assessing regulatory features that govern transcriptional activity.**

**Figure 5: Species-selective gene circuits.**

Accession codes

Primary accessions

Sequence Read Archive

SRP131663

Acknowledgements

We thank members of the Wang lab for helpful discussions and feedback. H.H.W. acknowledges funding support from the NIH (1DP5OD009172-02, 1U01GM110714-01A1), NSF (MCB-1453219), Sloan Foundation (FR-2015-65795), DARPA (W911NF-15-2-0065), and ONR (N00014-15-1-2704). N.I.J. is supported by an NSF Graduate Research Fellowship (DGE-16-44869). S.S.Y. is supported by the National Research Foundation of Korea (NRF-2017R1A6A3A03003401). We also thank T. Seto for help with plasmid construction; A. Figueroa for assistance with cell sorting; H. Salis for helpful discussions regarding the RBS calculator; D.B. Goodman for discussions regarding FACS-seq; G.M. Church (Harvard Medical School, Boston, Massachusetts, USA) for access to OLS libraries; and D. Dubnau (Rutgers New Jersey Medical School, Newark, New Jersey, USA), S. Lory, and A. Rasouly (both at Harvard Medical School, Boston, Massachusetts, USA) for providing the BD3182 and PAO1 Δpsy2 strains.

Author information

Antonio L C Gomes
Present address: Department of Immunology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
Nathan I Johns and Antonio L C Gomes: These authors contributed equally to this work.

Authors and Affiliations

Department of Systems Biology, Columbia University Medical Center, New York, New York, USA
Nathan I Johns, Antonio L C Gomes, Sung Sun Yim, Tomasz Blazejewski & Harris H Wang
Integrated Program in Cellular, Molecular and Biomedical Studies, Columbia University Medical Center, New York, New York, USA
Nathan I Johns & Tomasz Blazejewski
School of Engineering and Applied Sciences, Columbia University, New York, New York, USA
Anthony Yang
Broad Institute, Cambridge, Massachusetts, USA
Christopher S Smillie & Eric J Alm
Department of Biological Engineering, MIT, Cambridge, Massachusetts, USA
Mark B Smith & Eric J Alm
Computational and Systems Biology Initiative, MIT, Cambridge, Massachusetts, USA
Eric J Alm
The Center for Microbiome Informatics and Therapeutics, MIT, Cambridge, Massachusetts, USA
Eric J Alm
Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, USA
Sriram Kosuri
UCLA–DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, California, USA
Sriram Kosuri
Molecular Biology Institute, University of California, Los Angeles, Los Angeles, California, USA
Sriram Kosuri
Department of Pathology and Cell Biology, Columbia University Medical Center, New York, New York, USA
Harris H Wang

Contributions

N.I.J., A.L.C.G., C.S.S., M.B.S., E.J.A., S.K., and H.H.W. designed the study. N.I.J., S.S.Y., and H.H.W. performed the experiments. N.I.J., A.L.C.G., A.Y., T.B., and H.H.W. analyzed the data. N.I.J., A.L.C.G., and H.H.W. wrote the manuscript, with input from all other authors.

Corresponding author

Correspondence to Harris H Wang.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Metadata of the 184 donor genomes used to derive the regulatory sequences used in this study.

(a) Genome size, (b) genomic GC content, (c) Gram staining, (d) lifestyle, (e) number of regulatory sequences mined per genome, (f) number of genomes per phylum, and (g) the 16S phylogenetic tree.

Supplementary Figure 2 Vector designs.

Vector maps for pNJ1, pNJ2.1, and pNJ3.1, used for expression measurements of the metagenomic regulatory sequence (RS) library in E. coli, B. subtilis, and P. aeruginosa, respectively, and for pNJ6.0, pNJ3.1, pNJ7, and pNJ8, which were used for RS241 library measurements in E. coli, B. subtilis, P. aeruginosa, S. enterica, V. natriegens, and C. glutamicum.

Supplementary Figure 3 Replication experiments to validate method performance.

(a) Correlation of transcriptional measurements of RS library across two independent replicate cultures (>10 DNA counts across both replicates, n = 18,845) in E. coli performed on different days. (b) Correlation of transcriptional measurements of identical RSs with two different barcodes in E. coli (>10 DNA counts across both constructs, n = 2,273). Pearson correlation (r) is listed in each panel.

Supplementary Figure 4 Validations of gene expression measurements.

(a) Correlation of pooled RNA-seq measurements with individual RT-PCR data from isolate strains containing RS library members for three host species. (b) GFP fluorescence distributions of post-FACS RS library populations displayed as violin plots (n = 10,000 cells, mean value shown as horizontal bar). (c) Correlation of pooled FACS-seq measurements with individual flow cytometry measurements of isolate strains. Pearson correlation coefficients (r) and sample sizes (n) are listed in each subplot.

Supplementary Figure 5 Alternative reporter gene experiments.

Correlation between transcription (a) and translation (b) data measured using sfGFP and an alternate reporter mCherry. Sample sizes (n) and Pearson correlation coefficients (r) are listed in the lower right of each plot.

Supplementary Figure 6 Transcription start sites in three species.

Distribution of transcription start sites (TSSs) for active regulatory sequences containing one primary TSS with >70% of reads starting within +/- 5 bp. Most TSSs occur between 20-50 bp upstream of the start codon for B. subtilis, E. coli, and P. aeruginosa.
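
The single-primary-TSS criterion described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `primary_tss`, the choice of the most frequent read start as the candidate site, and the convention that positions are integers relative to the start codon are all assumptions made here.

```python
from collections import Counter


def primary_tss(read_starts, window=5, min_fraction=0.70):
    """Return the dominant TSS position, or None if no single site dominates.

    read_starts: iterable of integer 5' read-start positions for one construct
    (hypothetical convention: negative values are upstream of the start codon).
    A site qualifies when more than `min_fraction` of all reads start within
    +/- `window` bp of it.
    """
    counts = Counter(read_starts)
    total = sum(counts.values())
    if total == 0:
        return None
    # Candidate site: the most frequent start position (ties go to the smallest).
    peak = min(counts, key=lambda pos: (-counts[pos], pos))
    # Count reads starting within the window around the candidate site.
    near = sum(n for pos, n in counts.items() if abs(pos - peak) <= window)
    return peak if near / total > min_fraction else None


if __name__ == "__main__":
    reads = [-30] * 80 + [-32] * 5 + [-60] * 15
    print(primary_tss(reads))
```

With the example reads, 85 of 100 starts fall within 5 bp of position -30, so that position is reported; a construct whose reads split evenly between two distant sites would return `None`.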

Supplementary Figure 7 Alternative-growth-condition transcription data.

(a) Transcription activity for 18,205 members of the RS library across multiple growth conditions in E. coli is clustered and shown as a heatmap. Transcription levels are log₂(RNA/DNA) ratios normalized by the mean activity of control sequences (see Methods). (b) Ranked TSS locations of each RS measured in E. coli during LB exponential phase are shown, along with the TSS distribution (top panel) and the frequency of multiple TSSs (inset) of the RS library. (c) Frequency of matching TSS positions for RSs in LB and M9 growth media. A Pearson correlation of 1 signifies perfectly matched TSSs between conditions, and -1 denotes no correlation or anti-correlation. Intermediate values denote partial TSS matching. Example RSs with high, moderate, and no correlation in TSS positions in LB and M9 are shown in the inset (n = 18,205). (d) A subset of 100 robust RSs with condition-invariant transcription levels of different strengths (top panel), each generated from a single TSS with a different untranslated region (UTR) length (bottom panel), is provided as a useful community resource.
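
The transcription measure in panel (a) can be illustrated with a short sketch. The exact normalization is given in the Methods, which are not part of this page, so the details here are assumptions: the control mean is subtracted in log space, constructs need more than 10 DNA counts (borrowing the filter quoted for Supplementary Figure 3), and constructs with zero RNA counts are dropped. The function and argument names are hypothetical.

```python
import math


def normalized_log2_activity(rna, dna, controls, min_dna=10):
    """Compute log2(RNA/DNA) per construct, centered on the control mean.

    rna, dna: dicts mapping construct ID -> read count.
    controls: IDs of the control sequences.
    Assumption: normalization is a subtraction of the mean control
    log2 ratio; the paper's Methods may define it differently.
    """
    raw = {}
    for seq_id, dna_count in dna.items():
        rna_count = rna.get(seq_id, 0)
        # Skip poorly covered constructs and those with no RNA reads.
        if dna_count > min_dna and rna_count > 0:
            raw[seq_id] = math.log2(rna_count / dna_count)
    control_values = [raw[c] for c in controls if c in raw]
    if not control_values:
        raise ValueError("no control sequence passed the count filters")
    baseline = sum(control_values) / len(control_values)
    return {seq_id: value - baseline for seq_id, value in raw.items()}


if __name__ == "__main__":
    rna = {"ctrl1": 100, "ctrl2": 400, "rs1": 800, "rs2": 5, "rs3": 0}
    dna = {"ctrl1": 100, "ctrl2": 100, "rs1": 100, "rs2": 5, "rs3": 50}
    print(normalized_log2_activity(rna, dna, ["ctrl1", "ctrl2"]))
```

In the example, the two controls have log₂ ratios of 0 and 2, so the baseline is 1 and `rs1` (raw log₂ ratio 3) is reported as 2; `rs2` and `rs3` are filtered out.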

Supplementary Figure 8 Comparison of TSS data for regulatory sequences (RSs) across growth conditions in E. coli.

(a) A histogram of the distribution of all 10 pairwise comparisons of TSS position of regulatory sequences measured in 5 growth conditions (LB exponential growth phase, LB-exp; LB exponential with iron depletion, LB-Fe; LB exponential with high salt, LB-NaCl; LB stationary phase, LB-stat; M9 minimal media exponential phase, M9-exp) is shown (n = 18,205). Perfectly matched TSSs in two conditions have a Pearson correlation of 1, while an unmatched pair of TSSs has a correlation of -1. (b) A histogram of the mean TSS correlations (Pearson r) of all RSs across all pairwise conditions shows that almost half of RSs have the same TSS across all 5 conditions (n = 18,205).

Supplementary Figure 9 De novo motif search.

(a) Motif analysis of promoters binned by activity levels. The top two motifs identified by MEME for each recipient at the four activity bins (low, medium low, medium high, high) are shown. All motifs resembled the σ70 motif or its degenerate versions. Statistically non-significant motifs are displayed in gray. Additional MEME motif outputs are not shown since none were significantly different from σ70-like motifs. (b) Transcriptional activity heatmap grouped by hierarchical clustering (n = 395). Motif finding was performed to identify motifs across ten clusters. The corresponding motif for each cluster is indicated by a colored circle. (c) Removing regulatory sequences containing the σ70 motif from the dataset and repeating the analysis performed in (a) did not reveal additional non-σ70-like motifs (n = 76). Statistically non-significant motifs (MEME E-value > 1e-2) are displayed in gray in (b) and (c).

Supplementary Figure 10 The σ⁷⁰ motif is the dominant factor governing transcriptional activity of horizontally acquired regulatory sequences.

(a) Pearson correlations of transcriptional activity with promoter GC content (%GC), RNA structural stability (ΔG RNA), best σ70 match score (max(σ70)), and number of σ70 matches (n(σ70)) are displayed per recipient species. (b) Partial correlations of activity with each variable, controlling for the other variables. Sample sizes (n) are 4,314, 14,809, and 17,787 regulatory sequences for B. subtilis, E. coli, and P. aeruginosa, respectively.
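
The difference between the plain correlations in panel (a) and the partial correlations in panel (b) can be shown with a small sketch. It uses the standard recursive formula for partial correlation, which may or may not match the procedure the authors used; the function names and the toy variables `x`, `y`, and `z` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(data, x, y, controls):
    """Correlation of data[x] and data[y], controlling for data[c] for c in controls.

    data: dict mapping variable name -> list of values (same length each).
    Recursion: remove one control variable at a time using
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    if not controls:
        return pearson(data[x], data[y])
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(data, x, y, rest)
    r_xz = partial_corr(data, x, z, rest)
    r_yz = partial_corr(data, y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # x and y are each the shared driver z plus independent variation.
    toy = {"z": [1, 1, -1, -1], "x": [2, 0, 0, -2], "y": [2, 0, -2, 0]}
    print(pearson(toy["x"], toy["y"]))          # plain correlation
    print(partial_corr(toy, "x", "y", ["z"]))   # after controlling for z
```

In the toy data, `x` and `y` correlate only because both depend on `z`; the plain correlation is 0.5, while the partial correlation controlling for `z` is approximately zero. The same logic is what lets a partial-correlation analysis separate, for instance, the contribution of a σ70 match from that of GC content.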

Supplementary Figure 11 Regulatory sequence translation levels determined by FACS-seq in B. subtilis, E. coli, and P. aeruginosa.

(a) The distribution of GFP fluorescence values of the regulatory sequence library in each recipient. (b) Translational activity of 8,898 regulatory sequences with measurable GFP fluorescence data across all three recipients. (c) Analysis of ribosome binding site sequence motifs in highly translated constructs. Motif logos were constructed using WebLogo v3.5.0. The genomic GC content of each species was used for the background nucleotide frequency model and is listed in each subplot.

Supplementary Figure 12 Protein expression from Firmicute and Proteobacterial regulatory sequences.

Heatmap panels show the fraction of the RS library distributed across bins of transcription and translation levels in three recipients (colored columns). Donor RSs from Firmicutes genomes are shown in (a) and those from Proteobacteria genomes in (b). The top row of each heatmap subpanel uses values normalized by the total number of regulatory sequences. The middle row uses values normalized by each column bin, corresponding to transcription windows. The bottom row uses values normalized by each row bin, corresponding to translation windows. Gray rows indicate bins with fewer than 10 RSs in total, which were insufficient for analysis.

Supplementary Figure 13 Cross-species and in silico comparisons of gene expression levels.

(a) Correlation of regulatory sequence activity in terms of transcription level and translation efficiency (calculated as the ratio of GFP protein levels and transcription levels) between recipient species. Each point corresponds to a single regulatory sequence that has measurable transcription and translation data. Pearson correlation coefficient (r) and statistical significance values (p) are shown for each subplot (n=212 for all six panels). (b) Correlation between calculated translation (TL) efficiency based on the RBS calculator and our measured translation efficiency across highly transcribed regulatory sequences (top 15%) in each recipient species (n = 581, 2276, and 2198 for B. subtilis, E. coli, and P. aeruginosa respectively).
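
The translation efficiency defined in panel (a), the ratio of GFP protein level to transcription level, can be sketched as below. The function name, the dict-based inputs, and the assumption that both measurements are on a linear (not log) scale are illustrative choices made here, not details taken from the paper.

```python
def translation_efficiency(gfp, transcription):
    """Return construct ID -> GFP level divided by transcription level.

    gfp, transcription: dicts mapping construct ID -> measured level,
    assumed here to be on a linear scale. Constructs lacking a positive
    value in either measurement are skipped, mirroring the caption's
    restriction to sequences with measurable data for both.
    """
    efficiency = {}
    for seq_id, protein in gfp.items():
        rna = transcription.get(seq_id, 0)
        if protein > 0 and rna > 0:
            efficiency[seq_id] = protein / rna
    return efficiency


if __name__ == "__main__":
    gfp = {"rs1": 200.0, "rs2": 50.0, "rs3": 10.0}
    tx = {"rs1": 4.0, "rs2": 0.0}
    print(translation_efficiency(gfp, tx))
```

Only `rs1` has both measurements in the example, giving an efficiency of 50.0. Comparing these per-construct ratios between two recipient species is then a matter of correlating the values for the constructs they share.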

Supplementary Figure 14 Regulatory activity of RS241 library in six bacterial species.

Regulatory sequences are sorted by activity (from high to low) per species by (a) transcription or (b) translation levels. Regulatory sequences are re-sorted by mean transcription levels (from low to high) across all species and plotted for (c) transcription and (d) translation levels. Transcriptional values were normalized with the highest expression construct having a value of 10⁶. Gray lines correspond to sequences where no data were available. Species names are abbreviated as: B. subtilis, B.s.; C. glutamicum, C.g.; P. aeruginosa, P.a.; V. natriegens, V.n.; S. enterica, S.e.; E. coli, E.c.

Supplementary Figure 15 Cross-species transcription and translation level correlations.

(a) Pairwise Pearson correlation of transcription (blue triangle) and translation (green triangle) activity profiles of the RS241 library across six host species. Species are arranged based on their 16S phylogenetic similarity. Numbers in each box correspond to the Pearson correlation coefficients (n = 241). (b) Scatter plot showing each pairwise correlation described in (a).

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–15

Life Sciences Reporting Summary

Supplementary Table 1

Regulatory sequence library metadata

Supplementary Table 2

Library expression data for B. subtilis, E. coli, and P. aeruginosa

Supplementary Table 3

Library expression data for E. coli in five growth conditions

Supplementary Table 4

RS241 library expression data in six species

Supplementary Data Set 1

Strains and materials used in this study

About this article

Cite this article

Johns, N., Gomes, A., Yim, S. et al. Metagenomic mining of regulatory elements enables programmable species-selective gene expression. Nat Methods 15, 323–329 (2018). https://doi.org/10.1038/nmeth.4633

Received: 20 November 2017
Accepted: 25 January 2018
Published: 19 March 2018
Issue Date: 01 May 2018
DOI: https://doi.org/10.1038/nmeth.4633
