LINE-1 retrotransposons contribute to mouse PV interneuron development

Retrotransposons are mobile DNA sequences duplicated via transcription and reverse transcription of an RNA intermediate. Cis-regulatory elements encoded by retrotransposons can also promote the transcription of adjacent genes. Somatic LINE-1 (L1) retrotransposon insertions have been detected in mammalian neurons. It is, however, unclear whether L1 sequences are mobile in only some neuronal lineages or therein promote neurodevelopmental gene expression. Here we report programmed L1 activation by SOX6, a transcription factor critical for parvalbumin (PV) interneuron development. Mouse PV interneurons permit L1 mobilization in vitro and in vivo, harbor unmethylated L1 promoters and express full-length L1 mRNAs and proteins. Using nanopore long-read sequencing, we identify unmethylated L1s proximal to PV interneuron genes, including a novel L1 promoter-driven Caps2 transcript isoform that enhances neuron morphological complexity in vitro. These data highlight the contribution made by L1 cis-regulatory elements to PV interneuron development and transcriptome diversity, uncovered due to L1 mobility in this milieu.


Cell sorting and nucleic acid isolation
Neonate litters were obtained from time-mated C57BL/6 mice bred in-house at the QBI animal facility. The day of birth was defined as postnatal day 0 (P0). From each P0 litter of ~6 pups we dissected and pooled hippocampus tissue. Tissues were dissociated in a papain solution containing approximately 20U papain (Worthington) and 0.025mg DNase I (Worthington). Prior to use, papain was dissolved in HBSS (Gibco) with 1.1mM EDTA (Invitrogen), 0.067mM mercaptoethanol (Sigma) and 5mM cysteine-HCl (Sigma), and diluted in Hibernate E medium (Gibco). Tissue was incubated for 10min at 37°C with 0.5mL papain solution per embryo.
Following digestion, the cell suspension was passed through a 70μm mesh cell strainer, washed into Hibernate E supplemented with B27 (Gibco) and then centrifuged at 300g for 5min. From this point in the protocol onwards, reagents were pre-chilled and the remaining procedures were performed on ice. The cell pellet was resuspended in a blocking buffer (HBSS with 5% BSA). A rabbit anti-PV antibody conjugated to Alexa Fluor 647 (Bioss bs-1299R-A647, dilution 1:2000) was added directly to the cell suspension in blocking buffer and incubated for 1h at 4°C; the suspension was then passed through a 40μm mesh cell strainer and subjected to flow cytometry. The cell suspension was run through a 100μm nozzle at low pressure (28psi) on a BD FACSAria II flow cytometer (Becton Dickinson). This first sort isolated PV+ and PV- cells. To further isolate PV- neurons, PV- cells from the first sort were collected in tubes containing 40U RNaseOUT ribonuclease inhibitor (Invitrogen), then fixed in ice-cold 50% ethanol for 5min and centrifuged at 300g for 7min. Following centrifugation, cells were immunostained in blocking buffer containing a mouse anti-beta III Tubulin (Tub) antibody conjugated to Alexa Fluor 488 (Abcam ab195879, dilution 1:1000) and DAPI (Sigma D9542, 1μg/mL) for 15min at 4°C. Tub+ immunostained cells were subjected to a second sort on the same FACS machine with the specifications given above. Four populations of cells were collected: PV+ and PV- (Supplementary Fig. 1a, sort 1) and PV-/Tub+ and PV-/Tub- (Supplementary Fig. 1a, sort 2).
DNA and RNA were then extracted from each cell population. For RNA extractions, cells were sorted directly into the lysis buffer provided in the NucleoSpin RNA XS kit (Macherey Nagel), with RNA extraction performed following the manufacturer's specifications, except that on-column DNase treatment was performed twice for 20min, instead of once for 15min. For DNA extraction, purified cells were collected into a DNA lysis buffer containing TE buffer (10mM Tris-HCl pH 8 and 0.1mM EDTA), 2% SDS and 100μg/mL proteinase K, and DNA was extracted following a standard phenol-chloroform protocol.

L1 TF promoter bisulfite sequencing
Targeted bisulfite sequencing was performed as described previously 54 to assess L1 TF 5ʹUTR monomer CpG methylation genome-wide. Briefly, this involved extraction of genomic DNA from PV+, PV- and PV-/Tub+ populations purified from hippocampus tissue pooled from neonate littermates (Supplementary Fig. 1). Approximately 4×10^4 events per population were obtained from each of 3 litters (experimental triplicates). DNA was extracted via a conventional phenol-chloroform method and ethanol precipitation aided by glycogen (Ambion). DNA concentration was assessed with a Qubit dsDNA HS assay kit. Next, 20ng of genomic DNA was bisulfite converted using the EZ-DNA Methylation Lightning kit (Zymo Research, Cat# D5030) following the manufacturer's specifications. Bisulfite PCR reactions used MyTaq HS DNA polymerase (Bioline), and contained 1× reaction buffer, 12.5pmol of each primer, 2µL bisulfite treated DNA input template and 1U of enzyme in a 25µL final volume. PCR cycling conditions were as follows: 95°C for 2min, followed by 40 cycles of 95°C, 30sec; 54°C, 30sec; 72°C, 30sec and 1 cycle of 72°C, 5min. Primer sequences (BS_L1_TF_F and BS_L1_TF_R) were as provided in Supplementary Table 4. PCR products were visualized by electrophoresis on a 2% agarose gel, fragments of the expected size were excised, and DNA was extracted using a MinElute gel extraction kit (Qiagen, Cat# 28604) following the manufacturer's specifications. DNA concentration was assessed with a Qubit dsDNA HS assay kit and 30ng converted DNA was used as input for library preparation. Libraries were prepared using a NEBNext Ultra II DNA library prep kit (NEB, Cat# E7645S) and NEBNext Multiplex Oligos for Illumina (NEB, Cat# E6609S).
Libraries were eluted in 15µL H2O and concentrations measured with an Agilent 2100 Bioanalyzer using an Agilent HS DNA kit (Agilent Technologies, Cat# 5067-4627). Barcoded libraries of PV+, PV- and PV-/Tub+ populations from each of the 3 litters were mixed in equimolar quantities, diluted to 8nM, and combined with 50% PhiX spike-in control (Illumina, Cat# FC-110-3001). Single-end 300mer sequencing was then performed on a MiSeq platform (Illumina) using a MiSeq Reagent v3 kit (Illumina, Cat# MS-102-3003). Data were then analyzed as described elsewhere 17. To summarize, reads with the L1 TF bisulfite PCR primers at their termini were retained and aligned to the mock converted TF monomer target amplicon sequence with blastn. Reads where non-CpG cytosine bisulfite conversion was <95%, or ≥5% of CpG dinucleotides were mutated, or ≥5% of adenine and guanine nucleotides were mutated, were removed. For each triplicate cell population, 100 reads, excluding identical bisulfite sequences, were randomly selected and analyzed using QUMA 73 version 1.1.16 with default parameters and strict CpG recognition.
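The read filtering and subsampling steps summarized above can be illustrated with a minimal Python sketch. It is not the published pipeline: the function names, the per-read fractions passed as arguments and the fixed random seed are all assumptions introduced for illustration, and only the three thresholds and the sample size of 100 come from the text.

```python
import random


def passes_bisulfite_qc(non_cpg_conversion, cpg_mutated_frac, ag_mutated_frac):
    """Return True if a read survives the three filters described in the text.

    All arguments are per-read fractions in [0, 1] (hypothetical inputs).
    """
    if non_cpg_conversion < 0.95:   # non-CpG cytosine conversion below 95%
        return False
    if cpg_mutated_frac >= 0.05:    # 5% or more of CpG dinucleotides mutated
        return False
    if ag_mutated_frac >= 0.05:     # 5% or more of A/G nucleotides mutated
        return False
    return True


def sample_unique_reads(reads, n=100, seed=0):
    """Drop identical bisulfite sequences, then randomly select up to n reads."""
    unique = sorted(set(reads))     # sorted so the draw is reproducible
    rng = random.Random(seed)       # the seed is an assumption, not from the text
    return rng.sample(unique, min(n, len(unique)))
```

If fewer than 100 unique reads are available, this sketch returns all of them; how the original analysis handled that case is not stated here.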
Embryos were exposed via a laparotomy and 0.5-1.0μL of plasmid DNA combined with 0.0025% Fast Green dye, to aid visualization, was injected into the lateral ventricle of each embryo using a glass-pulled pipette connected to a Picospritzer II (Parker Hannifin). Injections involved combinations of either pUBC-L1SM-UBC-EGFP and pmCherry (1μg/μL each) or pMut2-UBC-L1SM-UBC-EGFP and pmCherry (1μg/μL each). Half of the pups from each litter were co-injected with pUBC-L1SM-UBC-EGFP and pmCherry into the left hemisphere and the other half with pMut2-UBC-L1SM-UBC-EGFP and pmCherry into the right hemisphere. Plasmids were directed into the forebrain by placement of 3mm diameter microelectrodes across the head, which delivered five square wave pulses (100ms, 1Hz) of approximately 36V via an ECM 830 electroporator (BTX). Once embryos were electroporated, uterine horns were replaced inside the abdominal cavity and the incision sutured closed. Dams received 1mL of Ringer's solution subcutaneously and an edible buprenorphine gel pack for pain relief. Dams were monitored daily until giving birth to live pups, which were collected for analysis at P10. All procedures were followed as approved by the University of Queensland Animal Ethics Committee (MRI-UQ/QBI/415/17).

Quantitative PCR on sorted cells and bulk hippocampus
Total RNA extracted from purified PV+, PV-, PV-/Tub+ and PV-/Tub- (Supplementary Fig. 1a, sorts 1 and 2) populations was used as input for SYBR Green and TaqMan qPCR assays. qPCR reactions were carried out using 300pg RNA/µL from purified PV+ and PV- cells and 100pg RNA/µL from purified PV-/Tub+ and PV-/Tub- cells. An RNA integrity number (RIN) above 6, as measured on an Agilent Bioanalyzer (Agilent Technologies, RNA 6000 Pico Kit, Cat# 5067-1513), was set as the minimum cutoff for RNA quality. All qPCRs were carried out on a LightCycler 480 Real-Time PCR system (Roche Life Science). Oligonucleotide PCR primers are listed in Supplementary Table 4. For each assay, the relative mRNA expression in a particular sample was calculated by the delta delta-CT method, using the negative population in the respective sort as control, i.e. PV+ was compared to PV- (Supplementary Fig. 1a, sort 1) and PV-/Tub+ was compared to PV-/Tub- (Supplementary Fig. 1a, sort 2). As the PV-/Tub+ and PV-/Tub- populations were isolated by two serial sorts, for some assays sufficient RNA was only available to perform qPCR on PV+ and PV- populations. For qPCR on bulk hippocampus, tissue was isolated from 12-week old animals housed in standard (STD, N=12) and enriched (ENR, N=14) environments. RNA extraction was performed by Trizol following the manufacturer's specifications (Trizol reagent, Invitrogen Cat# 15596026). Quantitative TaqMan PCR assays were performed as described above, using 40ng of RNA as input.
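As a worked illustration of the delta delta-CT calculation referred to above, the following Python sketch computes relative expression for one sample against its control population. The function and argument names are hypothetical, and the formula assumes roughly 100% amplification efficiency (a doubling per cycle), which is the standard assumption for this method and is not stated in the text.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative mRNA expression by the delta delta-CT method.

    sample  = positive population (e.g. PV+)
    control = negative population from the same sort (e.g. PV-)
    ref     = reference transcript measured in the same population
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)   # assumes one doubling per PCR cycle


# Example with made-up CT values: the target is detected 2 cycles earlier,
# relative to the reference, in the sample than in the control.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)   # 4.0
```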

RNA-seq analysis
The mappability of individual TE copies generally varies as a function of sequencing read length, as well as TE subfamily age and copy number 75,76. We therefore adopted a prior approach to quantify young mouse (L1 TF) and human (L1Hs) subfamily-level transcript abundance with RNA-seq 20,75,77. Analyzed datasets included Sams et al. 51, bulk hippocampus single-end (1×61mer) RNA-seq obtained from wild-type and conditional Ctcf knockout animals (SRA: SRP078142, N=3 pools of 2 animals per group), and Yuan et al. 50 bulk single-end (1×49mer) RNA-seq of neurons differentiated in vitro from human induced pluripotent stem cells, with and without LHX6 overexpression (SRA: SRP147748, N=3 per group). For each RNA-seq library, we aligned reads to the reference genome assembly (mouse: mm10, human: hg38) with STAR 78 version 2.6 (parameters --twopassMode Basic --outSAMprimaryFlag AllBestScore --winAnchorMultimapNmax 1000 --outFilterMultimapNmax 1000) and marked duplicate reads with Picard MarkDuplicates (http://broadinstitute.github.io/picard). We expected the high copy number and limited divergence of young L1 subfamilies to cause most of the corresponding RNA-seq reads to "multi-map" to multiple genomic loci 75,76. As conceived previously, we assigned multi-map reads a weighting at each of their aligned positions based on the abundance of uniquely mapping reads aligned within 100bp in the same library 20,75,77. Each position was assigned a weighting proportionate to the fraction of uniquely mapped reads found there, out of the total number of uniquely mapped reads within 100bp of any mapping position for the multi-mapping read. If no uniquely mapped reads were found near any of the aligned positions for a multi-mapped read, all positions were given an equal weighting. We then intersected the unique and weighted multi-map alignments with RepeatMasker coordinates and produced a total read count for L1 TF (RepeatMasker: "L1Md_T") and L1Hs genome-wide, normalized by dividing by the total mapped read count for that RNA-seq library (tags-per-million).
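The multi-map weighting and normalization rules described above can be sketched in a few lines of Python. This is an illustration of the stated rules only, not the code used for the analysis; the function names and the input representation (a list of nearby unique-read counts, one per aligned position) are assumptions.

```python
def multimap_weights(unique_counts):
    """Weights for one multi-mapping read.

    unique_counts[i] is the number of uniquely mapped reads within 100bp of
    the i-th aligned position of the read (hypothetical input format).
    """
    total = sum(unique_counts)
    n = len(unique_counts)
    if total == 0:
        # No unique reads near any position: equal weighting everywhere.
        return [1.0 / n] * n
    # Otherwise each position gets its share of the nearby unique reads.
    return [count / total for count in unique_counts]


def tags_per_million(weighted_l1_count, total_mapped_reads):
    """Normalize a subfamily read count by library size."""
    return weighted_l1_count * 1e6 / total_mapped_reads
```

For example, a read aligned to three positions with 3, 1 and 0 unique reads nearby would contribute weights of 0.75, 0.25 and 0 to those positions, so each read still contributes a total of one count.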

Bulk ATAC-seq analysis
Mouse cortex ATAC-seq data were previously generated by Mo et al. 24 for excitatory pyramidal neurons (marked by Camk2a), PV interneurons and VIP interneurons, via the isolation of nuclei tagged in specific cell types (INTACT) method. Paired-end fastq files were obtained from the Sequence Read Archive (SRA identifiers SRR1647880-SRR1647885). Trim Galore (parameters --max_n 2 --length 50 --trim-n) was used to apply CutAdapt 79 to read pairs to trim adapters and low quality bases. Processed reads were aligned to the reference genome (mm10) using bwa mem 80 with the parameter -a to output all multi-mapping alignments. Alignments were filtered to keep only those with an alignment score equal to the maximum achieved for that read. The resulting bam files were sorted using samtools 67. Peaks for each combined pair of duplicate experiments were called using MACS2 81 with default parameters, intersected with young L1 genomic coordinates, and then used to calculate the fraction of reads in each replicate aligned to at least one L1-associated peak.
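The alignment filter and the final per-replicate fraction can be sketched as follows. This is a simplified Python illustration on in-memory tuples rather than bam files; the tuple layout, the function names and the half-open peak intervals are assumptions, and in practice these steps would operate on the bwa mem and MACS2 outputs.

```python
from collections import defaultdict


def keep_best_alignments(alignments):
    """Keep alignments whose score equals the maximum for that read.

    alignments: list of (read_id, (chrom, pos), score) tuples (hypothetical).
    """
    best = defaultdict(lambda: float("-inf"))
    for read_id, _, score in alignments:
        best[read_id] = max(best[read_id], score)
    return [a for a in alignments if a[2] == best[a[0]]]


def fraction_in_l1_peaks(alignments, l1_peaks):
    """Fraction of reads with at least one alignment in an L1-associated peak.

    l1_peaks: list of (chrom, start, end), treated as half-open intervals.
    """
    reads = {read_id for read_id, _, _ in alignments}
    hits = set()
    for read_id, (chrom, pos), _ in alignments:
        if any(c == chrom and s <= pos < e for c, s, e in l1_peaks):
            hits.add(read_id)
    return len(hits) / len(reads) if reads else 0.0
```

The linear scan over peaks keeps the sketch short; an interval tree or sorted-coordinate intersection would be the usual choice at genome scale.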

scATAC-seq analyses
Human hippocampus scATAC-seq data reported by Corces et al. 49 were obtained from the SRA (identifiers SRR11442501 and SRR11442502). Read pairs were retained if the corresponding barcode was present in the 10x Genomics scATAC-seq Unique Molecular Identifier (UMI) whitelist (737K version 1) and then processed and aligned to the hg38 reference genome assembly, as per the bulk ATAC-seq analysis above. Cells (UMIs) with fewer than 10,000 uniquely aligned read pairs were discarded. For the human analysis, a cohort of 277 full-length (>5.9kbp) L1Hs elements defined previously 17 was employed. Cells were grouped into populations based on having at least one read aligned within the genomic coordinates of the proximal promoter of a given gene, with these coordinates as follows: PV, chr22:36816079-36818079; VIP, chr6:152749797-152751797; GFAP, chr17:44915750-44917750; EXC (CAMK2A), chr5:150289093-150291093. For each cell population, read depth was calculated across each full-length L1Hs copy, and these profiles were then summed to represent the L1Hs subfamily.
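The cell filtering and grouping rule above can be illustrated with a short Python sketch. The promoter coordinates are copied from the text; everything else (the function name, the dictionary input of per-cell read positions, and treating the coordinates as closed intervals) is an assumption made for illustration. Under this rule a cell may fall into more than one population if it has reads in more than one promoter window.

```python
# Proximal promoter windows (hg38) as listed in the text.
PROMOTERS = {
    "PV": ("chr22", 36816079, 36818079),
    "VIP": ("chr6", 152749797, 152751797),
    "GFAP": ("chr17", 44915750, 44917750),
    "EXC": ("chr5", 150289093, 150291093),
}


def group_cells(cell_reads, min_pairs=10_000):
    """Assign cells to populations by promoter accessibility.

    cell_reads: {barcode: [(chrom, pos), ...]} of uniquely aligned read pairs
    (hypothetical input format). Cells below min_pairs are discarded.
    """
    groups = {name: set() for name in PROMOTERS}
    for barcode, reads in cell_reads.items():
        if len(reads) < min_pairs:
            continue
        for name, (chrom, start, end) in PROMOTERS.items():
            # One read inside the window is enough to assign the cell.
            if any(c == chrom and start <= p <= end for c, p in reads):
                groups[name].add(barcode)
    return groups
```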

Environmental enrichment and exercise experimental design
At six weeks of age, CBA×C57BL/6 mice were randomly assigned to a standard (STD), enriched environment (ENR) or exercise (EXE) group, as described previously 82. All mice were exposed to their assigned housing condition for 6 weeks.

Supplementary Fig. 2: Environmental enrichment and exercise do not impact L1 mRNA abundance in PV interneurons. a, Standard (STD), exercise (EXE) and enriched (ENR) environment housing schematics. Mice (aged 6 weeks) were placed in either STD, EXE or ENR housing for 6 weeks. ENR and EXE housing consisted of a larger cage with nesting materials. EXE housing contained two running wheels to guarantee all mice had access to voluntary wheel running (excluded in the ENR group). ENR mice were also exposed to spatial stimuli: ladders, tunneling objects and toys of various textures, sizes, and shapes for sensory, cognitive and motor stimulation. Between weeks 10 and 12, ENR mice were exposed three times a week for one hour to a 'super-enriched' condition in a larger playground arena with novel toys. b, L1 TF RNA FISH spots (probe A) in PV+/Tub+ neurons from STD, EXE and ENR animal CA tissue. N(mice)=3-4. Cells from each mouse are color coded. Solid line: median. Dashed lines: quartiles. c, As for (b) but in DG. d, As for (b), except showing L1 TF RNA FISH spots in PV-/Tub+ neurons. e, As for (d), except in DG. f, TaqMan qPCR measuring abundance of the L1 TF mRNA monomeric 5ʹUTR (VIC channel) relative to 5S rRNA (FAM channel) in bulk hippocampus samples from STD, EXE and ENR mice. STD N=12, ENR N=14. g, As for (f), except targeting the L1 TF non-monomeric 5ʹUTR (FAM channel) relative to Gapdh (VIC channel). h, As for (g), except measuring L1 TF non-monomeric 5ʹUTR (FAM channel) relative to URR1 (HEX channel). i, As for (h) except targeting L1 TF ORF2 (FAM channel) relative to URR1 (HEX channel). STD N=9, ENR N=8. j, Pvalb (parvalbumin) mRNA expression in STD, EXE and ENR conditions, relative to Gapdh. Note: values in (f-j) are represented as mean ± SD. Significance testing was via one-way ANOVA with Tukey's post-hoc test comparing means of animals. No significant (P<0.05) differences were detected between groups.

Oligonucleotide PCR primers, as listed in Supplementary Table 4, were purchased from Integrated DNA Technologies. SYBR Green assay: PCR reactions were prepared using the Power SYBR Green RNA-to-CT 1 step kit (Applied Biosystems, Cat# 4391112). Reactions contained a 2× Power SYBR Green RT-PCR Mix, 10pmol of each primer, 1µL RNA input template and 1× reverse transcriptase enzyme mix in a 10µL final volume. Cycling conditions were as follows: 48°C for 30min, 95°C for 10min, followed by 40 cycles of 95°C, 15sec; 60°C, 1min. To assess potential DNA contamination, an L1 TF qPCR using primers L1Md_5UTR_F and L1Md_5UTR_R was performed with and without reverse transcriptase. RNA was considered free of DNA contamination when there was a difference of three or more cycles between reactions run with and without reverse transcriptase, and detection in the latter occurred after cycle 30. TaqMan assay: Applied Biosystems custom L1, URR1 and 5S rRNA TaqMan MGB probes, as listed in Supplementary Table 4, were purchased from Thermo Fisher (Cat# 4316032), as was a proprietary mouse Gapdh combination (Cat# 4352339E). TaqMan qPCR reactions contained: 4× TaqPath 1-Step RT-qPCR multiplex reaction master mix (ThermoFisher, Cat# A28521), 4pmol of each primer, 1pmol probe (with the exception of the ORF2/URR1 TaqMan reaction, for which we used 1pmol ORF2 primers) and 1µL RNA input template in a 10µL final volume. Cycling conditions were as follows: 37°C for 2min; 50°C for 15min; 95°C for 2min, followed by 40 cycles of 95°C, 3sec; 60°C, 30sec. TaqMan assays for L1 were multiplexed with assays for either 5S rRNA, Gapdh or URR1 controls. L1 probes were conjugated to VIC or 6FAM fluorophores. Controls were conjugated to HEX, VIC or 6FAM fluorophores. Primer/probe sequences and the associated detection channels are listed in Supplementary Table 4.

Supplementary Fig. 1: PV interneuron isolation. a, Schematic of PV+ and PV-/Tub+ neuron isolation from pooled neonate (P0) litter hippocampal tissue. An anti-PV conjugated antibody (AF647) was used to label and isolate PV+ cells. Freshly sorted PV- cells were subsequently labeled with an anti-Tub conjugated antibody (AF488) and sorted again to isolate PV-/Tub+ neurons. b, Gating strategy and purity of fluorescence activated cell sorting (FACS) for PV+ cells. c, As for (b), except showing PV-/Tub+ neurons. d, Quality control qPCR of relative PV mRNA expression in sorted cells. Pvalb (parvalbumin) mRNA enrichment in the PV+ population was observed, as expected. Data are relative to Gapdh and presented as mean ± SD. *P=0.015, two-tailed t test, N(litters)=4.

Running wheels were excluded from the ENR housing to ensure the effects of physical activity were exclusive to the EXE mice. All mice had ad libitum access to food and water and were housed in a controlled room at 22°C and 45% humidity on a 12:12 hour light/dark cycle. All procedures were approved by The Florey Institute of Neuroscience and Mental Health Animal