Massively parallel RNA device engineering in mammalian cells with RNA-Seq

Synthetic RNA-based genetic devices dynamically control a wide range of gene-regulatory processes across diverse cell types. However, the limited throughput of quantitative assays in mammalian cells has hindered the fast iteration and interrogation of sequence space needed to identify new RNA devices. Here we report a quantitative, rapid and high-throughput mammalian cell-based RNA-Seq assay for efficiently engineering RNA devices. We identify new ribozyme-based RNA devices that respond to theophylline, hypoxanthine, cyclic-di-GMP, and folinic acid from libraries of ~22,700 sequences in total. The small molecule responsive devices exhibit low basal expression and high activation ratios, significantly expanding our toolset of highly functional ribozyme switches. The large datasets obtained further provide conserved sequence and structure motifs that may be used for rationally guided design. The RNA-Seq approach offers a generally applicable strategy for developing broad classes of RNA devices, thereby advancing the engineering of genetic devices for mammalian systems.


Supplementary Figures
Supplementary Figure 1. RNA-Seq quantitatively measures raw RNA and DNA read counts for a control library of 256 unique sequences. (a) A schematic illustrating the mechanism of reverse transcription and primer binding sites on the ribozyme switch sequence followed by library barcoding and Illumina adapter sequence PCR. UCI, unique coverage index. (b) A schematic showing the non-cleaving ribozyme library, which consists of a mutated ribozyme core to prevent self-cleavage and a randomized loop II.
[Figure panels: replicate R² at DNA read cutoffs of 5, 10, 25, 50, 100, 200 and 400 reads.]
Supplementary Figure 4. Quantitative FACS-Seq for assaying protein expression levels as a result of ribozyme switch regulatory activity. (a) A schematic illustrating the mechanism by which ribozyme switches achieve conditional gene expression regulation, indicating that protein expression levels resulting from ribozyme switch activity can be assayed with FACS-Seq. (b) FACS-Seq workflow for measuring differential protein expression levels associated with a ribozyme switch library.
Cells are transfected with the indicated folinic acid ribozyme switches and incubated with 0 or 6 mM folinic acid for 48 hours. Labels above bars refer to activation ratio; first row are results in a HEK293T cell line that stably expresses SLC46A1, the second row are results in a parental HEK293T cell line. Error bars correspond to standard deviation of two biological replicates. Asterisks indicate the Benjamini-Hochberg positive false discovery rate of switching significance, using p-values from unpaired, one-tailed t-test, * 5%, ** 0.5%, *** 0.05%, **** 0.005%. sTRSV is the wild-type hammerhead ribozyme; sTRSVctl is a non-cleaving mutant of sTRSV. Filled circles are individual replicate data points. Activation ratio of mCherry/BFP is indicated above each set of bars.
sTRSV is the wild-type hammerhead ribozyme; sTRSVctl is a non-cleaving mutant of sTRSV; asterisks indicate the Benjamini-Hochberg positive false discovery rate of switching significance, using p-values from unpaired, one-tailed t-test, * 5%, ** 0.5%, *** 0.05%, **** 0.005%; error bars indicate standard deviation of two biological replicates. Filled circles are individual replicate data points.
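The asterisk convention in these captions can be illustrated with a minimal sketch. It assumes that one-tailed t-test p-values are already available, applies the standard Benjamini-Hochberg step-up adjustment, and maps the adjusted values onto the 5%, 0.5%, 0.05% and 0.005% thresholds. The function names and example p-values are hypothetical, and the authors' exact positive false discovery rate calculation may differ from this plain adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stars(q):
    """Map an adjusted p-value to the asterisk labels used in the captions.

    Assumption: a threshold is met when q is less than or equal to it.
    """
    thresholds = [(0.00005, "****"), (0.0005, "***"), (0.005, "**"), (0.05, "*")]
    for cutoff, label in thresholds:
        if q <= cutoff:
            return label
    return ""


# Hypothetical p-values for four switches (not data from this study).
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
example_labels = [stars(q) for q in example_q]
```

The adjustment is written out by hand so that the sketch has no dependencies; a statistics library could be substituted for both the t-test and the correction.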
(a) Histogram of standardized, normalized RNA read counts from an RNA-Seq assay for N5 and N6 loop II sequences of the xanthine ribozyme switch library. Inset zooms into the sequences with the lowest normalized RNA read counts. (b) Activation ratios of normalized RNA read counts from an RNA-Seq assay for N5 and N6 loop II sequences of the xanthine ribozyme switch library. Inset zooms into the sequences with the highest activation ratios.
(c) Histogram of standardized, normalized RNA read counts from an RNA-Seq assay for N5 and N6 loop II sequences of the folinic acid ribozyme switch library. Inset zooms into the sequences with the lowest normalized RNA read counts.
(d) Activation ratios of normalized RNA read counts from an RNA-Seq assay for N5 and N6 loop II sequences of the folinic acid ribozyme switch library. Inset zooms into the sequences with the highest activation ratios.
(e) Histogram of standardized, normalized RNA read counts from an RNA-Seq assay for N5 and N6 loop II sequences of the cyclic di-GMP-II ribozyme switch library. Inset zooms into the sequences with the lowest normalized RNA read counts.
(f) Activation ratios of normalized RNA read counts from an RNA-Seq assay for N5 and N6 loop II sequences of the cyclic di-GMP-II ribozyme switch library. Inset zooms into the sequences with the highest activation ratios.
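As a rough illustration of the quantities plotted in these panels, the sketch below computes a normalized RNA read count and an activation ratio per sequence. It assumes that normalization means dividing RNA reads by DNA reads for the same sequence, that the activation ratio is the normalized count with ligand divided by the count without ligand, and that the 100-read cutoff is applied to DNA reads. The function names, data layout and read counts are hypothetical; the standardization step mentioned in the captions is not reproduced.

```python
def normalized_rna(rna_reads, dna_reads, min_dna_reads=100):
    """RNA/DNA read ratio, or None when DNA coverage is below the cutoff."""
    if dna_reads < min_dna_reads:
        return None
    return rna_reads / dna_reads


def activation_ratios(counts, min_dna_reads=100):
    """Compute with-ligand / without-ligand ratios of normalized RNA counts.

    counts maps each sequence to {"minus": (rna, dna), "plus": (rna, dna)}.
    Sequences failing the read cutoff in either condition are dropped.
    """
    ratios = {}
    for seq, cond in counts.items():
        minus = normalized_rna(*cond["minus"], min_dna_reads)
        plus = normalized_rna(*cond["plus"], min_dna_reads)
        if minus is None or plus is None or minus == 0:
            continue
        ratios[seq] = plus / minus
    return ratios


# Hypothetical loop II sequences and read counts (not data from this study).
example_counts = {
    "GUGAA": {"minus": (64, 512), "plus": (256, 512)},
    "ACGUA": {"minus": (30, 80), "plus": (90, 300)},  # fails the cutoff
}
example_ratios = activation_ratios(example_counts)
```

In this toy input the first sequence passes the cutoff in both conditions and the second is excluded because its no-ligand DNA coverage is too low.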

Supplementary Tables
Supplementary Table 1

Supplementary Note 1. FACS-Seq noise
Sequencing read depth is typically correlated with data quality. The FACS-Seq and RNA-Seq assays had the same sequencing coverage; however, the data quality was poorer for FACS-Seq than RNA-Seq (Figure 2e, Figure 3c). For both FACS-Seq and RNA-Seq, sequences were filtered with a cutoff of 100 reads under each ligand and replicate condition, which was empirically determined to result in good replicate agreement for RNA-Seq (Supplementary