Cell-specific regulation of gene expression using splicing-dependent frameshifting

Precise and reliable cell-specific gene delivery remains technically challenging. Here we report a splicing-based approach for controlling gene expression whereby separate translational reading frames are coupled to the inclusion or exclusion of mutated, frameshifting cell-specific alternative exons. Candidate exons are identified by analyzing thousands of publicly available RNA sequencing datasets and filtering by cell specificity, conservation, and local intron length. This method, which we denote splicing-linked expression design (SLED), can be combined in a Boolean manner with existing techniques such as minipromoters and viral capsids. SLED can use strong constitutive promoters, without sacrificing precision, by decoupling the tradeoff between promoter strength and selectivity. AAV-packaged SLED vectors can selectively deliver fluorescent reporters and calcium indicators to various neuronal subtypes in vivo. We also demonstrate gene therapy utility by creating SLED vectors that can target PRPH2 and SF3B1 mutations. The flexibility of SLED technology enables creative avenues for basic and translational research.
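The candidate-exon filter and the frameshift principle described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the record fields, the scoring by difference in percent spliced in (PSI), and all three thresholds are hypothetical placeholders chosen only to show the shape of a filter on cell specificity, conservation, and local intron length. The helper `frame_after_exon` only encodes the arithmetic fact that including an exon whose length is not a multiple of three shifts the downstream reading frame.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CandidateExon:
    """Hypothetical record for one alternative exon (fields are assumptions)."""
    name: str
    psi_target: float        # fraction of transcripts including the exon in the target cell type (0-1)
    psi_off_target: float    # same quantity in off-target cell types (0-1)
    conservation: float      # placeholder conservation score (0-1)
    flanking_intron_bp: int  # combined length of the flanking introns, in base pairs


def delta_psi(exon: CandidateExon) -> float:
    """Absolute inclusion difference between target and off-target cells."""
    return abs(exon.psi_target - exon.psi_off_target)


def filter_candidates(
    exons: Iterable[CandidateExon],
    min_delta_psi: float = 0.8,      # assumed threshold, not from the paper
    min_conservation: float = 0.5,   # assumed threshold, not from the paper
    max_intron_bp: int = 2000,       # assumed threshold, not from the paper
) -> List[CandidateExon]:
    """Keep exons passing all three filters, most cell-specific first."""
    kept = [
        e for e in exons
        if delta_psi(e) >= min_delta_psi
        and e.conservation >= min_conservation
        and e.flanking_intron_bp <= max_intron_bp
    ]
    return sorted(kept, key=delta_psi, reverse=True)


def frame_after_exon(upstream_frame: int, exon_length_bp: int) -> int:
    """Reading frame (0, 1, or 2) downstream of an included exon."""
    return (upstream_frame + exon_length_bp) % 3


if __name__ == "__main__":
    demo = [
        CandidateExon("exonA", 0.95, 0.05, 0.9, 800),
        CandidateExon("exonB", 0.60, 0.40, 0.9, 800),
    ]
    for exon in filter_candidates(demo):
        print(exon.name, round(delta_psi(exon), 2))
    # A 61-bp exon shifts the frame; a 60-bp exon does not.
    print(frame_after_exon(0, 61), frame_after_exon(0, 60))
```

Taking the absolute PSI difference treats exons that are included only in the target cell type and exons that are skipped only in the target cell type as equally useful, which matches the abstract's statement that frames can be coupled to either inclusion or exclusion.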

Sample sizes were chosen with a minimum of n=3 when samples represent averages across large cell populations or tissues (e.g. retinas for subretinal injections, or independent FACS experiments of thousands of cells). For SLED ratio quantifications from microscopy images, a minimum of n=50 cells was counted across sections, and more cells were counted when available in the field of view. No sample size calculation was performed before measurement. Our choice of sample size is adequate for this study given that the strong specificity of SLED vectors led to strong statistical validation.
No data were excluded.
SLED vector experiments were repeated across multiple species to test whether specificity could be replicated in different organisms. No variation in specificity was observed between multiple injections or transfections for a given experimental condition. Experiments were performed at least three times (n=3) across multiple in vitro transfections or multiple AAV injections of SLED vectors. Independent replicates were successful except in cases of technical error (e.g. mistargeted surgical injections or loss of cell culture viability). In these cases, experiments were repeated once the technical errors were corrected.
Subretinal injections of SLED vectors were performed in a blinded and randomized manner. ERG and OCT data were analyzed in a double-blinded manner. Randomization was not necessary for testing SLED vectors, given that randomization is inherent to the tissues tested. Targeted cell types and off-target cell types exist in the same tissue sample and are both transduced by AAV, thus providing a randomized internal control.
The identity of the specific viruses used for subretinal injections was concealed from the experimenter performing the injections by another member of the lab. Mice labeled only with a generic identifier were then sent to CCF for ERG analysis. Data were unblinded only after ERG analysis was complete and the results had been shared.

March 2021
Antibodies

Manufacturer notes on validation:
Anti-Gad67 (Millipore MAB5406): Reacts with the 67kDa isoform of Glutamate Decarboxylase (GAD67) of rat, mouse and human origins, other species not yet tested. No detectable cross reactivity with GAD65 by Western blot on rat brain lysate when compared to blot probed with AB1511 that reacts with both GAD65 & GAD67.
Anti-somatostatin (Millipore MAB354): Recognizes Somatostatin. Shows no cross-reactivity to enkephalins, other endorphins, substance P or CGRP. Partially cross-reacts with somatostatin fragments.
Anti-HuCD (Thermo 16A11): This antibody recognizes the Elav family members HuC, HuD and Hel-N1 neuronal proteins. It does not recognize HuR, another Elav family member that is present in all proliferating cells. The antibody has been shown to specifically label neuronal cells in zebrafish, chick, canaries, and humans, and is likely to label neuronal cells in most vertebrate species. Labeling is visible early in development, at about the time that the neurons leave the mitotic cycle.
Anti-NeuN (Thermo MAB377): MILLIPORE's exclusive monoclonal antibody to vertebrate neuron-specific nuclear protein called NeuN (or Neuronal Nuclei) reacts with most neuronal cell types throughout the nervous system of mice including cerebellum, cerebral cortex, hippocampus, thalamus, spinal cord and neurons in the peripheral nervous system including dorsal root ganglia, sympathetic chain ganglia and enteric ganglia. Developmentally, immunoreactivity is first observed shortly after neurons have become postmitotic, no staining has been observed in proliferative zones. The immunohistochemical staining is primarily localized in the nucleus of the neurons with lighter staining in the cytoplasm. The few cell types not reactive with MAB377 include Purkinje, mitral and photoreceptor cells. The antibody is an excellent marker for neurons in primary cultures and in retinoic acid-stimulated P19 cells. It is also useful for identifying neurons in transplants.

Cells used for SF3B1 studies were tested for mutation status. SF3B1 genotype was confirmed in all lines by Sanger sequencing of exons 13-24. STR cell line authentication and mycoplasma testing were performed prior to the beginning of studies. All other cells were not authenticated.

Cell lines were not tested for mycoplasma contamination
No commonly misidentified lines were used.

Animals and other organisms
Laboratory animals: Male and female C57BL/6J mice and rds mice were obtained from The Jackson Laboratory, as indicated in Materials and Methods. Two-month-old C57BL/6J mice were used for testing SLED vectors in normal tissue. P0 rds mice were used for intravitreal retina injections of PRPH2-expressing vectors and aged to 3-4 months before OCT and ERG analysis. All mice were housed in 14-hour light/10-hour dark cycles at ambient temperature and humidity.
Wild animals: No wild animals were used.
Field-collected samples: No field-collected samples were used.
Ethics oversight: All animals were treated in accordance with the Johns Hopkins University Animal Care and Use Committee (IACUC) guidelines, protocol MO22M22.

Methodology
Sample preparation
After transfection or AAV transduction of SLED vectors, cells were processed into single cell suspensions ready for FACS