Genome-scale determination of 5´ and 3´ boundaries of RNA transcripts in Streptomyces genomes

Lee, Yongjae; Lee, Namil; Hwang, Soonkyu; Kim, Woori; Jeong, Yujin; Cho, Suhyung; Palsson, Bernhard O.; Cho, Byung-Kwan

doi:10.1038/s41597-020-00775-w

Data Descriptor
Open access
Published: 15 December 2020

Yongjae Lee¹, Namil Lee¹, Soonkyu Hwang¹, Woori Kim¹, Yujin Jeong¹, Suhyung Cho¹, Bernhard O. Palsson^2,3,4 (ORCID: orcid.org/0000-0003-2357-6785) & Byung-Kwan Cho^1,5 (ORCID: orcid.org/0000-0003-4788-4184)

Scientific Data volume 7, Article number: 436 (2020)

Abstract

Streptomyces species are gram-positive bacteria with GC-rich linear genomes, and they serve as dominant reservoirs for producing clinically and industrially important secondary metabolites. Genome mining of Streptomyces revealed that each species typically encodes 20–50 secondary metabolite biosynthetic gene clusters (smBGCs), emphasizing their potential for novel compound discovery. Unfortunately, most smBGCs are uncharacterized in terms of their products and regulation because they are silent under laboratory culture conditions. To translate the genomic potential of Streptomyces into practical applications, it is essential to understand the complex regulation of smBGC expression and to identify the underlying regulatory elements. To progress towards these goals, we applied two Next-Generation Sequencing methods, dRNA-Seq and Term-Seq, to industrially relevant Streptomyces species to reveal the 5´ and 3´ boundaries of RNA transcripts on a genome scale. These data provide a fundamental resource to aid our understanding of the regulation of smBGC expression in Streptomyces and to enhance their potential for secondary metabolite synthesis.

Measurement(s)	5´-ends of transcripts • 3´-ends of transcripts • RNA • TSS • transcription_termination_signal
Technology Type(s)	dRNA-Seq • Term-Seq • RNA sequencing
Factor Type(s)	Streptomyces growth phase
Sample Characteristic - Organism	Streptomyces avermitilis • Streptomyces clavuligerus • Streptomyces coelicolor • Streptomyces griseus • Streptomyces lividans • Streptomyces tsukubensis • Streptomyces venezuelae

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13259393

Background & Summary

Streptomyces species are gram-positive filamentous bacteria of great importance for their ability to produce a wide range of clinically or industrially important secondary metabolites^1,2. During the mid-20th century, the number of available antibiotics increased rapidly; in particular, more than 70% of the antibiotics from bacteria were discovered in Streptomyces species, emphasizing their importance as the dominant source of antimicrobial compounds³. However, the discovery of novel antibiotics declined rapidly during the latter part of the 20th century as research progress with Streptomyces species slowed, as reflected by the decreasing number of novel secondary metabolites discovered⁴. Fortunately, with the emergence of Next-Generation Sequencing (NGS) techniques, the genome sequences of many Streptomyces species have been collected, revealing an increased potential to produce novel secondary metabolites⁵. Computational prediction revealed that a single Streptomyces species typically possesses about 20–50 secondary metabolite biosynthetic gene clusters (smBGCs), and the large number of smBGCs in Streptomyces genomes encourages researchers to revisit these organisms to cope with the threat of emerging multi-drug resistant bacteria^6,7.

Despite their potential for the production of diverse secondary metabolites, most smBGCs have not been characterized in terms of their products and corresponding molecular functions, mainly because they are silent under laboratory culture conditions⁸. Since most secondary metabolites are not essential for growth and are produced in response to environmental stimuli, such as osmotic pressure, nutrient limitation or inter-species competition, the smBGCs are expected to be under tight and complex regulation^9,10,11. To utilize the genomic potential of Streptomyces, an understanding of the genetic regulatory mechanisms that activate smBGCs is crucial. Understanding transcriptional regulatory mechanisms is especially important, since transcription is the first step of gene expression and diverse regulatory events take place during transcription^12,13. Here we report NGS datasets from differential RNA-Seq (dRNA-Seq) and Term-Seq of seven important Streptomyces species, which provide transcript boundary information to broaden our understanding of transcriptional regulation in Streptomyces on a genome scale (Fig. 1)^14,15,16,17.

The seven species studied are Streptomyces avermitilis, Streptomyces clavuligerus, Streptomyces coelicolor, Streptomyces griseus, Streptomyces lividans, Streptomyces tsukubaensis and Streptomyces venezuelae, which are widely known for their scientific, clinical and industrial importance. S. coelicolor is the best-known model species, and S. avermitilis, S. clavuligerus, S. griseus and S. tsukubaensis are known for producing the anthelmintic agent avermectin, the β-lactamase inhibitor clavulanic acid, the antibiotic streptomycin and the immunosuppressive agent FK506, respectively^5,18,19,20,21. S. lividans and S. venezuelae are mainly used in industrial applications for heterologous expression of proteins and of smBGCs from other Streptomyces, since the genetic backgrounds of these two organisms are suitable for heterologous expression^22,23. Total RNAs of Streptomyces were pooled from four different growth phases, namely the early-exponential, transition, late-exponential and stationary phases, to cover genes expressed under starvation conditions as well as genes involved in primary metabolism during active growth (Fig. 1a)²⁴. dRNA-Seq reveals the transcription start sites (TSSs) of transcripts by differentiating the TSSs from the 5′-ends of processed transcripts. For dRNA-Seq, two libraries are constructed, one from the 5′-ends of unprocessed bacterial primary transcripts and the other from the 5′-ends of processed transcripts. By comparing the two libraries, TSSs can be differentiated from the processed 5′-ends. In contrast, Term-Seq captures the 3′-ends of transcripts, which leads to the identification of the genuine transcription termination sites (TTSs) and processed 3′-ends²⁵.

From the TSSs determined by dRNA-Seq, promoter sequences can be identified with the aid of computational motif discovery tools²⁶. In addition, TSS information makes it possible to determine, at single-nucleotide resolution, the 5′-untranslated region (5′-UTR) of each gene, which contains transcriptional or translational regulatory elements such as the ribosome binding site (RBS), riboswitches and upstream open reading frames^15,27,28,29. Likewise, transcriptional terminator sequences and 3′-UTRs can be determined from the 3′-end information of transcripts obtained from Term-Seq. With the aid of genome-wide transcriptome and translatome information, which can be obtained from RNA-Seq and Ribo-Seq, respectively, the transcriptional and translational effect of each regulatory element, including the promoter sequence, RBS or transcription terminator sequence, can be evaluated. Furthermore, the determined regulatory elements can be utilized to improve the production of secondary metabolites in Streptomyces through synthetic biology approaches. The transcript boundary information obtained from dRNA-Seq and Term-Seq will serve as a fundamental resource for understanding the complex regulatory mechanisms in bacteria and for improving industrial applications.

Methods

Strain and culture condition

S. avermitilis MA-4680, S. clavuligerus ATCC27064, S. coelicolor M145, S. griseus NBRC13350, S. lividans TK24, S. tsukubaensis NBRC108819 and S. venezuelae ATCC15439 were used in this study. The mycelium of each Streptomyces was maintained in 25% glycerol at −80 °C. Cells were cultured in 50 mL R5− media with 8 g glass beads (3 ± 0.3 mm diameter) at 30 °C. The R5− medium consists of 5.73 g TES (pH 7.2), 103 g sucrose, 10 g glucose, 5 g yeast extract, 10.12 g MgCl₂∙6H₂O, 0.25 g K₂SO₄, 0.1 g casamino acids, 0.08 mg ZnCl₂, 0.4 mg FeCl₃∙6H₂O, 0.02 mg CuCl₂∙2H₂O, 0.02 mg MnCl₂∙4H₂O, 0.02 mg Na₂B₄O₇∙10H₂O, and 0.02 mg (NH₄)₆Mo₇O₂₄∙4H₂O in 1 L distilled water. Cell growth was determined by measuring optical density at 600 nm with biological triplicates. The cells were sampled at four different time points according to the growth profile of each strain, which were early-exponential (E), transition (T), late-exponential (L), and stationary (S) phases. The E, T, L, and S time points were 13, 17, 19.5, 33.5 h for S. avermitilis, 26, 80, 105.5, 125 h for S. clavuligerus, 12, 16, 20, 34 h for S. coelicolor, 15.5, 28, 32.5, 48.5 h for S. griseus, 9.5, 14, 16, 20 h for S. lividans, 13, 19, 23, 31 h for S. venezuelae, and 15, 18.5, 28, 48 h for S. tsukubaensis after inoculation, respectively (Fig. 1a). For NGS library preparation, cultures for each strain were inoculated in eight flasks as biological octuplicates and cells were harvested from two flasks for each growth phase as biological duplicates.

RNA extraction

After harvesting, the cells were washed with polysome buffer (20 mM Tris-HCl pH 7.5, 140 mM NaCl, 5 mM MgCl₂) and resuspended in lysis buffer (0.3 M sodium acetate pH 5.2, 10 mM EDTA, 1% Triton X-100). The cell suspension was frozen in liquid nitrogen and then physically lysed by grinding with a mortar and pestle. The cell lysate was centrifuged at 4 °C for 10 min at 16000 × g, and the supernatant was saved and stored at −80 °C until used for RNA extraction. For RNA extraction, the supernatant was mixed with an equal volume of phenol:chloroform:isoamyl alcohol = 25:24:1 solution. The mixture was then centrifuged and RNA was extracted from the upper aqueous phase by ethanol precipitation.

For Term-Seq of S. coelicolor and S. griseus, RNA was extracted by lysing cells with hot phenol. The harvested cells were resuspended in Sol 1 (25 mM Tris-HCl pH 8.0, 10 mM EDTA, 50 mM glucose, 2 mg/mL lysozyme) and incubated at 30 °C for 10 minutes. After incubation, the cells were pelleted by centrifugation and the supernatant was discarded. The cell pellet was resuspended in AE-SDS (50 mM sodium acetate pH 5.2, 10 mM EDTA, 1% sodium dodecyl sulfate) and the suspension was mixed with an equal volume of phenol:chloroform = 5:1 solution. Cells were lysed by incubating at 65 °C for 5 min and centrifuged. RNA was extracted from the upper aqueous phase by isopropanol precipitation, and the genomic DNA aggregate that formed upon addition of isopropanol was removed before precipitation.

To remove any DNA contamination, the RNA samples were treated with DNase I (New England Biolabs, Ipswich, MA, USA).

dRNA-Seq library preparation

The four DNase I-treated RNA samples from the four growth phases were mixed in equal amounts to obtain one 10 μg RNA mixture, and a total of two RNA mixtures were prepared from the eight RNA samples as biological duplicates for each strain. The rRNA in the RNA mixture was depleted using the Ribo-Zero rRNA Removal Kit for Bacteria (Epicentre, Madison, WI, USA). The rRNA-depleted RNA was incubated in 1 × RNA 5′ polyphosphatase (TAP; Epicentre) reaction buffer and 1 U of SUPERase-In (Invitrogen, Carlsbad, CA, USA) at 37 °C for 1 h, with or without TAP for TAP(+) or TAP(−) libraries, respectively. The reaction was cleaned up by ethanol precipitation, and 5 pmol of 5′ RNA adaptor (5′-ACACUCUUUCCCUACACGACGCUCUUCCGAUCU-3′) was ligated to the purified RNA using T4 RNA ligase (Thermo Fisher Scientific, Waltham, MA, USA) by incubating at 37 °C for 90 min in 1 × RNA ligase buffer and 0.1 mg/mL BSA. The ligation product was then purified using Agencourt AMPure XP beads (Beckman Coulter, Brea, CA, USA) according to the manufacturer’s instructions. The purified product was reverse-transcribed with SuperScript III Reverse Transcriptase (Invitrogen) according to the manufacturer’s instructions and purified using Agencourt AMPure XP beads. The purified cDNA was amplified and indexed using Phusion High-Fidelity DNA Polymerase (Thermo Fisher Scientific) for Illumina sequencing. The amplification step was monitored using a CFX96 Real-Time PCR Detection System (Bio-Rad Laboratories, Hercules, CA, USA) and stopped before the PCR reaction was fully saturated. Finally, the amplified library was purified using Agencourt AMPure XP beads.

Term-seq library preparation

Term-Seq libraries for the six species other than S. coelicolor were prepared as previously described^15,17. Equal amounts of DNase I-treated RNA from the sampling time points were mixed and used as the input for Term-Seq library construction. The RNA was treated with the Ribo-Zero rRNA Removal Kit for Bacteria (Epicentre) to deplete rRNA. The resulting 500–900 ng of rRNA-depleted RNA was mixed with 1 μL of 150 μM amino-blocked DNA adaptor (5′-p-NNAGATCGGAAGAGCGTCGTGT-3′), 2.5 μL of 10 × T4 RNA ligase 1 buffer, 2.5 μL of 10 mM ATP, 2 μL of DMSO, 9.5 μL of 50% PEG8000, and 2.5 μL of T4 RNA ligase 1 (New England BioLabs). The mixture was incubated at 23 °C for 2.5 h and the reaction was cleaned up using Agencourt AMPure XP beads. The adaptor-ligated RNA was then fragmented by incubating at 72 °C for 90 seconds in fragmentation buffer (Ambion, Inc, Austin, TX, USA). The fragmentation reaction was cleaned up using Agencourt AMPure XP beads. The fragmented RNA (8 μL in total) was reverse transcribed with SuperScript III Reverse Transcriptase using 1 μL of 10 μM reverse transcription primer (5′-TCTACACTCTTTCCCTACACGACGCTCTTC-3′) according to the manufacturer’s instructions. The cDNA was then purified with Agencourt AMPure XP beads. Another amino-blocked adaptor with a different sequence (5′-p-NNAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC-3′) was ligated to the cDNA with an increased incubation time (8 h). The ligation product was purified using Agencourt AMPure XP beads and indexed by PCR with Phusion High-Fidelity DNA Polymerase using forward (5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCT-3′) and reverse (5′-CAAGCAGAAGACGGCATACGAGATNNNNNN (6 nt index) GTGACTGGAGTTCAGAC-3′) primers. The PCR reaction was monitored using a CFX96 Real-Time PCR Detection System and stopped before the PCR reaction was fully saturated. The PCR product was purified with Agencourt AMPure XP beads.

For S. coelicolor, 1 μg of total RNA, instead of rRNA-depleted RNA, was ligated with 1 μL of 150 μM amino-blocked DNA adaptor (5′-p-NNAGATCGGAAGAGCGTCGTGT-3′) as described above. After ligation, rRNA was removed using Hybridase™ Thermostable RNase H (Lucigen Corporation, Middleton, WI, USA). 13.5 μL of the purified ligation product was mixed with 1.5 μL of 10 × DNase I Reaction Buffer (New England BioLabs), 15 μL of Hybridase complement buffer (90 mM Tris-HCl pH 7.5, 200 mM KCl), 1 μL of anti-rRNA oligo mix (detailed composition is available in Figshare) and 2 μL of 50 mM MgCl₂, and incubated in a thermal cycler (heated to 95 °C, then cooled to 65 °C)³⁰. 2 μL of Hybridase™ Thermostable RNase H was added to the mixture, which was then incubated at 65 °C for 20 minutes, 90 °C for 1 second and 65 °C for 10 minutes. The rRNA depletion reaction was cleaned up using RNA Clean & Concentrator Kits (Zymo Research, Irvine, CA, USA) and the product was processed according to the remaining procedures described above.

High-throughput sequencing and data processing

All libraries were sequenced using either the Illumina MiSeq or the Illumina HiSeq 2500 platform with a read length of either 1 × 100 bp (dRNA-Seq) or 1 × 50 bp (Term-Seq), except for the dRNA-Seq of S. tsukubaensis. For the dRNA-Seq of S. tsukubaensis, both TAP(+) and TAP(−) libraries were sequenced using the Illumina MiSeq platform with a 1 × 150 bp read length. The reads were processed using CLC Genomics Workbench. The raw reads were first mapped to the phiX sequence, which is used on Illumina sequencing platforms for quality control. The detailed mapping parameters are as follows. Mismatch cost: 2; Insertion cost: 3; Deletion cost: 3; Length fraction: 0.9; Similarity fraction: 0.9; Map randomly for non-specific matches. After mapping to the phiX sequence, unmapped reads were collected and trimmed to remove adaptor sequences, short reads and low-quality reads. The detailed parameters are as follows. Quality score limit: 0.05; Maximum number of ambiguities: 2; Remove adaptors; Discard read lengths below 15. For Term-Seq, two nucleotides at both ends were removed since the adaptors include two random nucleotides. The trimmed reads were mapped to the available reference genomes (Accession numbers: BA000030 for S. avermitilis, CP027858 and CP027859 for S. clavuligerus, NC_003888 for S. coelicolor, NC_010572 for S. griseus, CP009124 for S. lividans, CP020700 for S. tsukubaensis, CP059991 for S. venezuelae) with the same parameters as for phiX mapping, except for the non-specific match handling (non-specific matches were discarded). After mapping to the reference genomes, the directions of the mapped Term-Seq reads were inverted since the sequencing output comes in the reverse direction.
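The Term-Seq specific steps (removing the two random adaptor nucleotides from each read end and inverting the direction of mapped reads) can be illustrated with a minimal Python sketch. This is not the CLC Genomics Workbench procedure itself: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and applying the 15 nt length cutoff after the end trimming is an assumption.

```python
MIN_LENGTH = 15   # "Discard read lengths below 15" (assumed to apply after end trimming)
RANDOM_NT = 2     # random nucleotides contributed by each Term-Seq adaptor


def trim_random_ends(seq, qual, n=RANDOM_NT, min_length=MIN_LENGTH):
    """Remove n nucleotides from both ends of a read.

    Returns (sequence, quality) or None if the trimmed read is too short.
    """
    trimmed_seq = seq[n:len(seq) - n]
    trimmed_qual = qual[n:len(qual) - n]
    if len(trimmed_seq) < min_length:
        return None
    return trimmed_seq, trimmed_qual


def transcript_three_prime_end(start, end, strand):
    """Invert the strand of a mapped Term-Seq read and return (strand, 3'-end).

    start/end are 1-based inclusive genome coordinates with start <= end.
    After inversion, the 3'-end is the right-most base on '+' and the
    left-most base on '-'.
    """
    inverted = "-" if strand == "+" else "+"
    position = end if inverted == "+" else start
    return inverted, position


# A read mapped to the '+' strand at 100..140 reports a transcript on the
# '-' strand whose 3'-end lies at position 100.
print(transcript_three_prime_end(100, 140, "+"))
```

Counting the positions returned by `transcript_three_prime_end` per strand would give the 3′-end read counts used for peak calling.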

Identification of read count enriched positions

To determine the read count enriched peak positions, which represent possible TSSs for dRNA-Seq or TTSs for Term-Seq, the read count enrichment at a specific position was represented by the z-score of the read count at that position, as previously described³¹. The detailed calculation is as follows.

$$Z\left(x\right)=\frac{r\left(x\right)-m\left(C\left(x\right)\right)}{\sigma (C(x))}$$

Z(x) is the modified z-score at position x, and r(x) is the read count at position x. The read counts were determined for the 5′-ends of mapped reads for dRNA-Seq and the 3′-ends of mapped reads for Term-Seq. m(C(x)) and σ(C(x)) are the mean and standard deviation of the read counts of the other positions, C(x), near position x. C(x) is the set of positions y satisfying 2 < |x − y| ≤ 50. The z-score of each position was first calculated separately for the biological replicates and then averaged. Note that any position x where m(C(x)) ≤ 0.25 in any of the biological replicates was ignored. For dRNA-Seq, the z-scores were calculated only for the positions in TAP(+) libraries where the normalized read counts were more than two-fold higher than the normalized read counts of the same positions in TAP(−) libraries. Finally, positions with z-scores higher than 12 were retained, and if multiple positions lay within 3 nt of each other, the positions with lower z-scores were discarded.
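The calculation above can be sketched in plain Python as follows. This is an illustrative reimplementation under stated assumptions, not the authors' deposited script: the function names are hypothetical, a population standard deviation is used, windows are clipped at the sequence ends, positions with zero standard deviation are skipped, and the TAP(+)/TAP(−) comparison normalizes by library total because the normalization method is not specified in the text.

```python
from statistics import mean, pstdev


def local_z_scores(counts, inner=2, outer=50, min_mean=0.25):
    """Z(x) for one replicate; positions failing the m(C(x)) filter are omitted."""
    scores = {}
    for x in range(len(counts)):
        # C(x): positions y with inner < |x - y| <= outer (clipped at the ends)
        left = counts[max(0, x - outer):max(0, x - inner)]
        right = counts[x + inner + 1:x + outer + 1]
        window = left + right
        if len(window) < 2:
            continue
        m, sd = mean(window), pstdev(window)
        if m <= min_mean or sd == 0:  # sd == 0 skip is an assumption
            continue
        scores[x] = (counts[x] - m) / sd
    return scores


def tap_enriched_positions(tap_plus, tap_minus, fold=2.0):
    """Positions where normalized TAP(+) counts exceed fold x TAP(-) counts."""
    plus_total = sum(tap_plus) or 1
    minus_total = sum(tap_minus) or 1
    return {
        x for x, (p, m) in enumerate(zip(tap_plus, tap_minus))
        if p > 0 and p / plus_total > fold * (m / minus_total)
    }


def call_peaks(replicates, threshold=12.0, merge_dist=3, allowed=None):
    """Average Z(x) over replicates, threshold, and keep the best peak per cluster."""
    per_rep = [local_z_scores(c) for c in replicates]
    # A position is ignored if it fails the filter in any replicate
    shared = set(per_rep[0]).intersection(*per_rep[1:])
    if allowed is not None:
        shared &= set(allowed)
    avg = {x: mean(s[x] for s in per_rep) for x in shared}
    candidates = sorted((x for x in avg if avg[x] > threshold),
                        key=lambda x: -avg[x])
    kept = []
    for x in candidates:  # highest z-score first
        if all(abs(x - k) > merge_dist for k in kept):
            kept.append(x)
    return sorted((x, avg[x]) for x in kept)


# Toy example: a low background with one strong peak and a weaker neighbour
background = [1, 2] * 60
rep1 = list(background); rep1[60], rep1[61] = 500, 100
rep2 = list(background); rep2[60], rep2[61] = 400, 80
print(call_peaks([rep1, rep2]))
```

For dRNA-Seq, the set returned by `tap_enriched_positions` can be passed as `allowed` so that only TAP(+)-enriched positions are scored; for Term-Seq, `allowed` is left as `None`.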

Data Records

For dRNA-Seq, raw read FASTQ files of three species (S. avermitilis, S. clavuligerus, S. tsukubaensis) were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive under the accession numbers SRP158023³², SRP188290³³, and SRP103795³⁴, respectively. The dRNA-Seq raw read FASTQ files of S. griseus, S. lividans and S. venezuelae were deposited in the European Nucleotide Archive (ENA) under the study accession numbers PRJEB40918³⁵, PRJEB31507³⁶ and PRJEB36379³⁷, respectively. For Term-Seq of S. avermitilis and S. lividans, raw read FASTQ files were deposited under the same accessions as the dRNA-Seq. For S. clavuligerus, the Term-Seq raw read FASTQ files were also deposited in the NCBI Sequence Read Archive under the accession numbers SRX6937123³⁸ and SRX6937124³⁹. For S. coelicolor and S. griseus, the Term-Seq raw read FASTQ files were deposited under the same accession as the dRNA-Seq of S. griseus. The Term-Seq raw read FASTQ files of S. tsukubaensis and S. venezuelae were deposited in the European Nucleotide Archive (ENA) under the study accession number PRJEB36379³⁷.

The RNA-Seq data for six Streptomyces species, S. avermitilis, S. clavuligerus, S. coelicolor, S. lividans, S. tsukubaensis and S. venezuelae, were retrieved from the accession numbers SRP158023³², SRP188290³³, SRP058830⁴⁰, PRJEB31507³⁶, SRP103795³⁴ and PRJEB34219⁴¹, respectively. The RNA-Seq data for S. griseus were deposited under the same accession as the dRNA-Seq of S. griseus and the Term-Seq of S. coelicolor and S. griseus. The predicted TSSs and TTSs, along with the Python scripts used, were deposited in Figshare³⁰. A brief summary of smBGC information and the detailed composition of the anti-rRNA oligo mix used to deplete rRNA of S. coelicolor were also uploaded to Figshare³⁰.

Technical Validation

Evaluation of sequencing results

A total of 38 NGS libraries, comprising 24 dRNA-Seq libraries of S. avermitilis, S. clavuligerus, S. griseus, S. lividans, S. tsukubaensis and S. venezuelae, and 14 Term-Seq libraries of S. avermitilis, S. clavuligerus, S. coelicolor, S. griseus, S. lividans, S. tsukubaensis and S. venezuelae, were generated covering four different growth phases with biological replicates (dRNA-Seq data of S. coelicolor covering more diverse culture conditions are available in a previous study performed by our group) (Fig. 1)²⁴. The sequencing resulted in 4.97–26.60 and 3.47–16.1 million reads per library for dRNA-Seq and Term-Seq, respectively, after removing the phiX-mapped reads (Tables 1 and 2). The retained reads were trimmed to remove adaptor sequences and to discard short and low-quality reads. After trimming, the retained reads were subjected to sequencing quality control in terms of the Phred quality score⁴². Most reads showed an average Phred quality score of around 30–40, which corresponds to base-calling error probabilities of roughly 10⁻³ to 10⁻⁴ (Fig. 2a, b). After evaluating the quality of the trimmed reads, the reads were mapped to the reference genomes, resulting in 59.39% mapped reads for dRNA-Seq and 75.00% mapped reads for Term-Seq on average.
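The relation between a Phred quality score Q and the base-calling error probability P is Q = −10·log₁₀(P), so Q = 30 corresponds to P = 10⁻³ and Q = 40 to P = 10⁻⁴. The short Python sketch below illustrates the conversion and the averaging of per-base scores for one read. The function names are hypothetical, and the Phred+33 ASCII offset is an assumption that matches the standard encoding of current Illumina FASTQ files.

```python
import math


def phred_to_error_probability(q):
    """Error probability for a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def mean_read_quality(quality_string, offset=33):
    """Average Phred score of one FASTQ quality string (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


print(phred_to_error_probability(30))   # error probability at Q30
print(mean_read_quality("IIIIFFFF"))    # 'I' encodes Q40, 'F' encodes Q37
```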

Table 1 Sequencing statistics of dRNA-Seq.

Table 2 Sequencing statistics of Term-Seq.

Assessment of sequencing datasets

For technical validation of the dRNA-Seq and Term-Seq results, we evaluated whether the read count enriched positions in dRNA-Seq and Term-Seq correspond to increases and decreases in the RNA-Seq profiles, respectively. The dRNA-Seq and Term-Seq read count enriched positions, which are highly likely to represent the in vivo TSSs and TTSs, respectively, were determined for each Streptomyces species (refer to Methods for detailed information about the determination of read count enriched positions)³¹. On average, about 525 potential TSSs and 1285 potential TTSs were determined for each species, and about 7% of the predicted TSSs and 8% of the predicted TTSs were found in smBGC regions (the smBGCs for each Streptomyces species were predicted using antiSMASH)⁴³. The determined TSS and TTS information and the smBGC information are available at Figshare³⁰. Then, the RNA-Seq read density near the potential TSSs and TTSs was calculated (the RNA-Seq data were obtained from the same culture conditions)^32,33,34,36,41. Across the TSSs predicted from dRNA-Seq, the RNA-Seq read density increased drastically for all six Streptomyces species in the four growth phases, indicating that the TSSs were successfully captured by dRNA-Seq (Fig. 3a). Likewise, the RNA-Seq read density decreased drastically across the TTSs predicted from Term-Seq, indicating that the TTSs were successfully captured by Term-Seq for all seven Streptomyces species (Fig. 3b).
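The read density comparison can be illustrated with a minimal Python sketch that averages strand-specific RNA-Seq coverage in a window around each predicted site, oriented so that the direction of transcription always runs left to right. This is a sketch under stated assumptions rather than the deposited script: the function name, the 0-based coordinates, the per-strand coverage lists and the window size are hypothetical, and no depth normalization is applied.

```python
def mean_profile(coverage, sites, flank=50):
    """Average coverage from -flank to +flank around each site.

    coverage: dict mapping '+' or '-' to a list of per-base read depth
    sites:    list of (position, strand) tuples, 0-based positions
    Sites too close to the sequence ends are skipped.
    """
    totals = [0.0] * (2 * flank + 1)
    used = 0
    for pos, strand in sites:
        track = coverage[strand]
        if pos - flank < 0 or pos + flank >= len(track):
            continue
        window = track[pos - flank:pos + flank + 1]
        if strand == "-":
            window = window[::-1]  # orient along the direction of transcription
        for i, depth in enumerate(window):
            totals[i] += depth
        used += 1
    return [t / used for t in totals] if used else []


# Toy example: one transcript per strand, each starting at a predicted TSS
coverage = {
    "+": [0] * 100 + [10] * 100,   # '+' transcript starts at position 100
    "-": [10] * 100 + [0] * 100,   # '-' transcript starts at position 99
}
profile = mean_profile(coverage, [(100, "+"), (99, "-")], flank=5)
print(profile)
```

For a well-supported set of TSSs, the averaged profile is low upstream of the site and high downstream of it; for TTSs the opposite pattern is expected.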

Code availability

Read count enriched positions and the RNA-Seq read density across those positions were determined using two Python (version 3.5.2) scripts, which are publicly available in Figshare³⁰.

References

Bérdy, J. Bioactive microbial metabolites. J Antibiot 58, 1–26, https://doi.org/10.1038/ja.2005.1 (2005).
Article Google Scholar
Demain, A. L. Pharmaceutically active secondary metabolites of microorganisms. Appl Microbiol Biotechnol 52, 455–463, https://doi.org/10.1007/s002530051546 (1999).
Article CAS PubMed Google Scholar
Demain, A. L. From natural products discovery to commercialization: a success story. J Ind Microbiol Biotechnol 33, 486–495, https://doi.org/10.1007/s10295-005-0076-x (2006).
Article CAS PubMed Google Scholar
Silver, L. L. Challenges of antibacterial discovery. Clin Microbiol Rev 24, 71–109, https://doi.org/10.1128/CMR.00030-10 (2011).
Article CAS PubMed PubMed Central Google Scholar
Bentley, S. D. et al. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature 417, 141–147, https://doi.org/10.1038/417141a (2002).
Article ADS PubMed Google Scholar
Nett, M., Ikeda, H. & Moore, B. S. Genomic basis for natural product biosynthetic diversity in the actinomycetes. Nat Prod Rep 26, 1362–1384, https://doi.org/10.1039/b817069j (2009).
Article CAS PubMed PubMed Central Google Scholar
Ventola, C. L. The antibiotic resistance crisis: part 1: causes and threats. P T 40, 277–283 (2015).
PubMed PubMed Central Google Scholar
Challis, G. L. & Hopwood, D. A. Synergy and contingency as driving forces for the evolution of multiple secondary metabolite production by Streptomyces species. Proc Natl Acad Sci USA 100(Suppl 2), 14555–14561, https://doi.org/10.1073/pnas.1934677100 (2003).
Article ADS CAS PubMed PubMed Central Google Scholar
Bibb, M. J. Regulation of secondary metabolism in streptomycetes. Curr Opin Microbiol 8, 208–215, https://doi.org/10.1016/j.mib.2005.02.016 (2005).
Article CAS PubMed Google Scholar
Bursy, J. et al. Synthesis and uptake of the compatible solutes ectoine and 5-hydroxyectoine by Streptomyces coelicolor A3(2) in response to salt and heat stresses. Appl Environ Microbiol 74, 7286–7296, https://doi.org/10.1128/AEM.00768-08 (2008).
Article CAS PubMed PubMed Central Google Scholar
Lee, N. et al. Iron competition triggers antibiotic biosynthesis in Streptomyces coelicolor during coculture with Myxococcus xanthus. ISME J, https://doi.org/10.1038/s41396-020-0594-6 (2020).
Bervoets, I. & Charlier, D. Diversity, versatility and complexity of bacterial gene regulation mechanisms: opportunities and drawbacks for applications in synthetic biology. FEMS Microbiol Rev 43, 304–339, https://doi.org/10.1093/femsre/fuz001 (2019).
Article CAS PubMed PubMed Central Google Scholar
Browning, D. F. & Busby, S. J. The regulation of bacterial transcription initiation. Nat Rev Microbiol 2, 57–65, https://doi.org/10.1038/nrmicro787 (2004).
Article CAS PubMed Google Scholar
Cho, B. K. et al. The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol 27, 1043–1049, https://doi.org/10.1038/nbt.1582 (2009).
Article CAS PubMed Google Scholar
Dar, D. et al. Term-seq reveals abundant ribo-regulation of antibiotics resistance in bacteria. Science 352, aad9822, https://doi.org/10.1126/science.aad9822 (2016).
Article CAS PubMed PubMed Central Google Scholar
Hwang, S. et al. Primary transcriptome and translatome analysis determines transcriptional and translational regulatory elements encoded in the Streptomyces clavuligerus genome. Nucleic Acids Res 47, 6114–6129, https://doi.org/10.1093/nar/gkz471 (2019).
Article CAS PubMed PubMed Central Google Scholar
Lee, Y. et al. The Transcription Unit Architecture of Streptomyces lividans TK24. Front Microbiol 10, 2074, https://doi.org/10.3389/fmicb.2019.02074 (2019).
Article PubMed PubMed Central Google Scholar
Burg, R. W. et al. Avermectins, new family of potent anthelmintic agents: producing organism and fermentation. Antimicrob Agents Chemother 15, 361–367, https://doi.org/10.1128/aac.15.3.361 (1979).
Article CAS PubMed PubMed Central Google Scholar
Paradkar, A. Clavulanic acid production by Streptomyces clavuligerus: biogenesis, regulation and strain improvement. J Antibiot (Tokyo) 66, 411–420, https://doi.org/10.1038/ja.2013.26 (2013).
Article CAS Google Scholar
Barreiro, C. et al. Draft genome of Streptomyces tsukubaensis NRRL 18488, the producer of the clinically important immunosuppressant tacrolimus (FK506). J Bacteriol 194, 3756–3757, https://doi.org/10.1128/JB.00692-12 (2012).
Article CAS PubMed PubMed Central Google Scholar
Waksman, S. A. Streptomycin: background, isolation, properties, and utilization. Science 118, 259–266, https://doi.org/10.1126/science.118.3062.259 (1953).
Article ADS CAS PubMed Google Scholar
Myronovskyi, M. & Luzhetskyy, A. Heterologous production of small molecules in the optimized Streptomyces hosts. Nat Prod Rep 36, 1281–1294, https://doi.org/10.1039/c9np00023b (2019).
Article CAS PubMed Google Scholar
Anné, J., Vrancken, K. & Van Mellaert, L. Van Impe, J. & Bernaerts, K. Protein secretion biotechnology in Gram-positive bacteria with special emphasis on Streptomyces lividans. Biochim Biophys Acta 1843, 1750–1761, https://doi.org/10.1016/j.bbamcr.2013.12.023 (2014).
Article CAS PubMed Google Scholar
Jeong, Y. et al. The dynamic transcriptional and translational landscape of the model antibiotic producer Streptomyces coelicolor A3(2). Nat Commun 7, 11605, https://doi.org/10.1038/ncomms11605 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Dar, D. & Sorek, R. High-resolution RNA 3′-ends mapping of bacterial Rho-dependent transcripts. Nucleic Acids Res 46, 6797–6805, https://doi.org/10.1093/nar/gky274 (2018).
Article CAS PubMed PubMed Central Google Scholar
Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37, W202–208, https://doi.org/10.1093/nar/gkp335 (2009).
Article CAS PubMed PubMed Central Google Scholar
Shine, J. & Dalgarno, L. The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc Natl Acad Sci USA 71, 1342–1346, https://doi.org/10.1073/pnas.71.4.1342 (1974).
Article ADS CAS PubMed PubMed Central Google Scholar
Morris, D. R. & Geballe, A. P. Upstream open reading frames as regulators of mRNA translation. Mol Cell Biol 20, 8635–8642, https://doi.org/10.1128/mcb.20.23.8635-8642.2000 (2000).
Article CAS PubMed PubMed Central Google Scholar
Garst, A. D., Edwards, A. L. & Batey, R. T. Riboswitches: structures and mechanisms. Cold Spring Harb Perspect Biol 3, https://doi.org/10.1101/cshperspect.a003533 (2011).
Lee, Y. et al. Genome-scale determination of 5′ and 3′ boundaries of RNA transcripts in Streptomyces genomes. figshare https://doi.org/10.6084/m9.figshare.c.5044730 (2020).
Lalanne, J. B. et al. Evolutionary Convergence of Pathway-Specific Enzyme Expression Stoichiometry. Cell 173, 749–761 e738, https://doi.org/10.1016/j.cell.2018.03.007 (2018).
Article CAS PubMed PubMed Central Google Scholar
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRP158023 (2020).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRP188290 (2019).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRP103795 (2019).
European Nucleotide Archive https://identifiers.org/ena.embl:PRJEB40918 (2020).
European Nucleotide Archive https://identifiers.org/ena.embl:PRJEB31507 (2019).
European Nucleotide Archive https://identifiers.org/ena.embl:PRJEB36379 (2020).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRX6937123 (2020).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRX6937124 (2020).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRP058830 (2016).
European Nucleotide Archive https://identifiers.org/ena.embl:PRJEB34219 (2019).
Ewing, B., Hillier, L., Wendl, M. C. & Green, P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8, 175–185, https://doi.org/10.1101/gr.8.3.175 (1998).
Blin, K. et al. antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res 47, W81–W87, https://doi.org/10.1093/nar/gkz310 (2019).

Acknowledgements

This work was supported by the Bio & Medical Technology Development Program (2018M3A9F3079664 to B.-K.C.) through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (MSIT). This work was also supported by a grant from the Novo Nordisk Foundation (NNF10CC1016517 to B.O.P.).

Author information

Authors and Affiliations

Department of Biological Sciences and KI for the BioCentury, Korea Advanced Institute of Science and Technology, Daejeon, 34141, Republic of Korea
Yongjae Lee, Namil Lee, Soonkyu Hwang, Woori Kim, Yujin Jeong, Suhyung Cho & Byung-Kwan Cho
Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
Bernhard O. Palsson
Department of Pediatrics, University of California San Diego, La Jolla, CA, 92093, USA
Bernhard O. Palsson
Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, 2800, Denmark
Bernhard O. Palsson
Intelligent Synthetic Biology Center, Daejeon, 34141, Republic of Korea
Byung-Kwan Cho

Authors

Yongjae Lee
Namil Lee
Soonkyu Hwang
Woori Kim
Yujin Jeong
Suhyung Cho
Bernhard O. Palsson
Byung-Kwan Cho

Contributions

B.-K.C. designed the study. Y.L., N.L., S.H., W.K., and Y.J. performed the experiments. Y.L., S.C., and B.-K.C. performed data analysis. Y.L., N.L., S.C., B.O.P., and B.-K.C. wrote the manuscript.

Corresponding author

Correspondence to Byung-Kwan Cho.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

About this article

Cite this article

Lee, Y., Lee, N., Hwang, S. et al. Genome-scale determination of 5´ and 3´ boundaries of RNA transcripts in Streptomyces genomes. Sci Data 7, 436 (2020). https://doi.org/10.1038/s41597-020-00775-w

Received: 06 July 2020
Accepted: 19 November 2020
Published: 15 December 2020
DOI: https://doi.org/10.1038/s41597-020-00775-w
