Transposable elements (TEs) regulate diverse biological processes, from early development to cancer. Expression of young TEs is difficult to measure with next-generation, single-cell sequencing technologies because their highly repetitive nature means that short complementary DNA reads cannot be unambiguously mapped to a specific locus. Single CELl LOng-read RNA-sequencing (CELLO-seq) combines long-read single cell RNA-sequencing with computational analyses to measure TE expression at unique loci. We used CELLO-seq to assess the widespread expression of TEs in two-cell mouse blastomeres as well as in human induced pluripotent stem cells. Across both species, old and young TEs showed evidence of locus-specific expression with simulations demonstrating that only a small number of very young elements in the mouse could not be mapped back to the reference with high confidence. Exploring the relationship between the expression of individual elements and putative regulators revealed large heterogeneity, with TEs within a class showing different patterns of correlation and suggesting distinct regulatory mechanisms.
The datasets generated during the current study are available under ArrayExpress accession E-MTAB-9577. We analyzed two-cell RNA-seq data from GSE97778, GSE66390, GSE76687 and GSE71434; ATAC–seq from GSE76642 and GSE66390; H3K9me3 data from GSE97778; H3K4me3 from GSE73952, GSE76687 and GSE71434; H3K27me3 from GSE73952 and GSE76687; and whole-genome bisulfite data from GSE97778 and E-MTAB-9090. We analyzed hiPSC RNA-seq data from GSE47626 and GSE56568; H3K4me3, H3K9me3, H3K27me3 and whole-genome bisulfite data from GSE16265; and H3K4me3 from GSE16256.
The code used for data analysis is available in the following GitHub repositories: https://github.com/MarioniLab/CELLOseq, https://github.com/MarioniLab/sarlacc and https://github.com/MarioniLab/long_read_simulations.
We thank the Sanger and WIMM cytometry core facility for sorting of hiPSCs. We thank the Sanger, CRUK and WIMM sequencing facility for sequencing NGS data. We thank the WIMM single-cell facility for generating the 10X data. We thank V. Sundaram for fruitful discussions. We thank P. Gould, W. Reik and D. O’Carroll for helpful comments on the manuscript. This research was supported by a Sir Henry Wellcome Fellowship to R.V.B. (no. 213612), an EBPOD Fellowship to A.Y., an HFSP Long Term Fellowship to C.E.L., support from Cancer Research UK (CRUK) (C9545/A29580) and Core support from EMBL to J.C.M. and a Wellcome Trust Fellowship to J.S.B.
The authors declare no competing interests.
Peer review information Nature Biotechnology thanks Geoffrey Faulkner and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Boxplots of counts across the 50 most highly expressed young L1 unique loci in mouse ES cells cultured in 2i medium, measured by (a) ScNaUmi-seq or (b) CELLO-seq. (c) Histogram of the number of UMIs (y-axis) by UMI group size (x-axis) for two-cell blastomere CELLO-seq data sequenced on a MinION flow cell or an Illumina platform. (d) Light-field microscopy image of two-cell embryo blastomere isolation: two-cell embryo (i) with zona pellucida, (ii) without zona pellucida and (iii) as single blastomeres. This experiment was repeated more than 20 times. Scale bar, 25 μm. (e) Bar graph of read numbers for the mouse two-cell embryo dataset (right) and human iPSCs (left). (f) Density plot of the number of molecules (y-axis) by length of mapped molecules (x-axis) for hiPSCs and mouse blastomeres. (g) Scatter plot of the number of reads (y-axis) versus the number of genes in hiPSCs from Smart-seq2 libraries sequenced by Illumina. (h) Schematic of the sarlacc workflow. We demultiplexed samples by grouping barcodes with a Levenshtein distance below the grouping threshold. We performed pregrouping by mapping the reads to the relevant transcriptome. We then grouped the reads by UMI sequence and either error-corrected the reads within each true UMI group or, in deduplication mode, picked a random read from the UMI group. For this study we used error-corrected reads. (i) Bar plot of the fraction of reads (y-axis) by their relative position on a transcript (x-axis), measured from the start or the end of the molecule depending on the gene length. (j) Scatter plot of short-read (y-axis) versus long-read (x-axis) gene expression depending on the length of the gene. (k) Scatter plot of ERCC concentration (y-axis) and ERCC molecules (x-axis) for mouse blastomere CELLO-seq data. (l) Scatter plot of ERCC concentration (y-axis) and ERCC molecules (x-axis) from mouse blastomere CELLO-seq libraries with Illumina sequencing. For j-l, the Pearson correlation coefficient (R) and two-sided P value are shown.
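The grouping and deduplication steps described for panel (h) can be illustrated with a minimal Python sketch. This is not the sarlacc implementation: the read dictionaries, the function names `group_by_umi`, `pregroup_then_group` and `deduplicate`, the greedy grouping strategy and the example threshold are all assumptions made for illustration. Error correction by consensus is not sketched here.

```python
import random
from collections import defaultdict


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def group_by_umi(reads, threshold):
    """Greedy grouping (an assumption): a read joins the first group whose
    founding UMI is at a distance below `threshold`, else it founds a new group."""
    groups = []  # list of (founding_umi, member_reads)
    for read in reads:
        for founder, members in groups:
            if levenshtein(read["umi"], founder) < threshold:
                members.append(read)
                break
        else:
            groups.append((read["umi"], [read]))
    return [members for _, members in groups]


def pregroup_then_group(reads, threshold):
    """Pregroup reads by the transcript they mapped to, then group by UMI."""
    by_transcript = defaultdict(list)
    for read in reads:
        by_transcript[read["transcript"]].append(read)
    grouped = []
    for members in by_transcript.values():
        grouped.extend(group_by_umi(members, threshold))
    return grouped


def deduplicate(groups, seed=0):
    """Deduplication mode: keep one randomly chosen read per UMI group."""
    rng = random.Random(seed)
    return [rng.choice(group) for group in groups]


if __name__ == "__main__":
    # Hypothetical reads; identifiers, transcripts and UMIs are made up.
    example_reads = [
        {"id": "r1", "transcript": "t1", "umi": "ACGTACGT"},
        {"id": "r2", "transcript": "t1", "umi": "ACGTACGA"},
        {"id": "r3", "transcript": "t1", "umi": "TTTTGGGG"},
        {"id": "r4", "transcript": "t2", "umi": "ACGTACGT"},
    ]
    for group in pregroup_then_group(example_reads, threshold=2):
        print([read["id"] for read in group])
```

Pregrouping by transcript keeps the pairwise UMI comparisons small, because only reads assigned to the same transcript are compared with each other.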
(a) Expression of TE-derived isoforms in human iPSCs and mouse two-cell data, stratified by whether a repeat acts as a transcript end site (TES) or as a transcript start site (TSS). Mouse: TES (n = 353), TSS (n = 76). Human: TES (n = 537), TSS (n = 73). (b) Number of TE-derived isoforms in hiPSCs and mouse blastomeres with a repeat as TES or TSS. (c) Bar plot of the repeat families underlying repeat-derived isoforms in hiPSCs. (d) Bar plot of the repeat families underlying repeat-derived isoforms in mouse blastomeres. (e) Frequency plot of the number of repeats (x-axis) by age of TE (mya) (y-axis). TEs that mapped to hiPSCs (top), all repeats from the human UCSC RepeatMasker annotation (bottom). TEs are grouped by class and color-coded by family. (f) Boxplots of the age (mya) of TEs (y-axis) for young human L1s either detected or not detected in the hiPSC dataset. Detected: L1HS (n = 15), L1PA2 (n = 34), L1PA3 (n = 34), L1PA4 (n = 20), L1PA5 (n = 18), L1PA6 (n = 17). Not detected: L1HS (n = 1692), L1PA2 (n = 5148), L1PA3 (n = 11194), L1PA4 (n = 12471), L1PA5 (n = 11735), L1PA6 (n = 6195). (g) Boxplots of the age (mya) of TEs (y-axis) for young mouse L1s either detected or not detected in mouse blastomeres. Detected: L1MdA (n = 27), L1MdT (n = 44), L1MdF3 (n = 8), L1MdF2 (n = 43), L1MdGf (n = 1), L1MdF (n = 5). Not detected: L1MdA (n = 16817), L1MdT (n = 23644), L1MdF3 (n = 16138), L1MdF2 (n = 64855), L1MdGf (n = 1079), L1MdF (n = 4011). The boxplots in a, f and g show the median, first and third quartiles as a box, and the whiskers indicate the most extreme data point within 1.5 lengths of the box.
(a) bargraph showing the number of reads (y-axis) by the simulation type with either 1x or 10x coverage (x-axis), color-coded by alignment type with mapped = read at correct location after mapping with minimap2 to the genome, mismapped = read maps at wrong location, unmapped = read not mapped, unresolved = group has more than one molecule present and group cannot be resolved to a unique read. mouse (left), human (right). (b) bargraph of proportion of read group sizes (y-axis) by alignment type (x-axis), left showing 1x read coverage, right showing 10x read coverage. Color-coded by group size. mouse (top), human (bottom). (c) Stacked bargraph showing proportion of L1 elements (y-axis) by simulation type using 10x read coverage (x-axis), colored by specificity score, mouse (left), human (right). (d) Jitter plot of TE subfamily (y-axis) by TE age (million years ago) grouped by simulation type and coloured by % of mapped reads with yellow being 0% mapped and dark blue being 100% mapped. Mouse L1 top panel and human L1 bottom panel. Simulation type: perfect = perfect read identity, ONT = ONT read identity, ONT 5x = ONT read identity with 5x coverage, sarlacc corrected 5x = ONT read identity score, 5x coverage with sarlacc error correction, sarlacc corrected 10x = ONT read identity score, 10x coverage with sarlacc error correction, sarlacc deduplicated 5x = ONT read identity score, 5x coverage with sarlacc deduplication by randomly choosing 1 read. PG = perfect grouping.
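The four alignment categories defined in panel (a) can be expressed as a small classification function. The sketch below is illustrative only: the `SimulatedRead` fields, the use of a plain string comparison for locus identity and the order in which the checks are applied are assumptions, not the logic of the long_read_simulations repository.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimulatedRead:
    true_locus: str                # locus the read was simulated from
    aligned_locus: Optional[str]   # locus reported by the aligner, None if unaligned
    molecules_in_group: int = 1    # distinct source molecules in the read's UMI group


def classify(read: SimulatedRead) -> str:
    """Assign one of the four categories used in the caption."""
    if read.molecules_in_group > 1:
        return "unresolved"   # group mixes molecules, no unique read can be chosen
    if read.aligned_locus is None:
        return "unmapped"     # aligner reported no alignment
    if read.aligned_locus == read.true_locus:
        return "mapped"       # aligned back to the locus of origin
    return "mismapped"        # aligned, but to a different locus


def tally(reads) -> Counter:
    """Count reads per category, as plotted in a stacked bar graph."""
    return Counter(classify(read) for read in reads)


if __name__ == "__main__":
    # Hypothetical loci, for illustration only.
    example = [
        SimulatedRead("L1_locus_A", "L1_locus_A"),
        SimulatedRead("L1_locus_A", "L1_locus_B"),
        SimulatedRead("L1_locus_C", None),
        SimulatedRead("L1_locus_D", "L1_locus_D", molecules_in_group=2),
    ]
    print(tally(example))
```

In a real analysis the comparison of true and aligned locus would need genomic coordinates and an overlap criterion rather than string equality.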
(a) Distribution of Levenshtein distance between randomly simulated UMI (x-axis) based on UMI length with RYN pattern (left) or NNN pattern (right). Light grey bar shows distance threshold for grouping of reads by UMIs used for most short read UMIs or CELLO-seq. (b) Line graph of fraction of pure groups (y-axis) by Levenshtein distance (x-axis) by UMI group, either with perfect read identity or ONT read identity. On the left is the line graph of UMI simulations without any pregrouping by mapping. On the right the line graph is UMI simulation where pregrouping was performed by random assignment of true UMI sequences into groups of 100 unique UMIs. (c) distribution plot of UMI group sizes (x-axis) by Levenshtein distance threshold (y-axis) based on UMI length, with perfect ONT read identity and no pregrouping (left) or pregrouping (right).
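The simulation in panel (a) compares randomly drawn UMIs that follow either an RYN or an NNN pattern. A minimal sketch of that idea follows, using the standard IUPAC codes (R = A or G, Y = C or T, N = any base). The function names, the number of pattern repeats and the number of sampled pairs are hypothetical choices for illustration and are not taken from the study.

```python
import random
from collections import Counter

# Standard IUPAC nucleotide codes used by the two UMI patterns.
IUPAC = {"R": "AG", "Y": "CT", "N": "ACGT"}


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def random_umi(pattern: str, repeats: int, rng: random.Random) -> str:
    """Draw one UMI made of `repeats` copies of `pattern` (for example "RYN")."""
    return "".join(rng.choice(IUPAC[symbol]) for symbol in pattern * repeats)


def distance_distribution(pattern: str, repeats: int, n_pairs: int, seed: int = 0) -> Counter:
    """Histogram of Levenshtein distances between independently drawn UMI pairs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_pairs):
        first = random_umi(pattern, repeats, rng)
        second = random_umi(pattern, repeats, rng)
        counts[levenshtein(first, second)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical settings: 12-nt UMIs, 1,000 random pairs per pattern.
    for pattern in ("RYN", "NNN"):
        histogram = distance_distribution(pattern, repeats=4, n_pairs=1000)
        print(pattern, sorted(histogram.items()))
```

Comparing the two histograms for a given UMI length shows how much separation between unrelated UMIs each pattern leaves relative to a chosen grouping threshold.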
(a) Heatmap of expression of all SINE elements in mouse blastomeres, with rows clustered by SINE family and color-coded by TE subfamily. (b) Heatmap of expression of full-length (>5000 nt) elements in mouse blastomeres, with rows clustered by TE family and color-coded by TE subfamily. (c) Heatmap of logcounts of the most highly expressed (mean expression > 1) elements in hiPSCs, with rows clustered by TE subfamily. (d) Boxplot of the percentage of reads mapped to TEs or TE families in CELLO-seq mouse two-cell data. P value: L1Md to SINE B1/B2 = 0.004998, L1Md to MERVL = 0.004998, two-sided Wilcoxon rank sum test. n = 6 cells. (e) Boxplot of the percentage of TEs expressed by number of TEs in the genome in CELLO-seq mouse blastomeres. P value: repeats to L1Md = 0.0022, repeats to SINE B1/B2 = 0.0022, repeats to MERVL = 0.0022, two-sided Wilcoxon rank sum test. n = 6 cells. (f) Boxplot of the number of MERVL elements expressed in each cell of CELLO-seq two-cell data compared to published short-read data. CELLO-seq (n = 6 cells), bulk (n = 7 independent experiments). (g) Boxplot of the number of HERVH-int elements expressed in each cell of CELLO-seq compared to published short-read data. CELLO-seq (n = 96 cells), bulk (n = 10 independent experiments). (h) Expression, methylation, ATAC-seq and ChIP-seq data of MERVL elements with read counts in CELLO-seq libraries compared to MERVL elements with no counts in CELLO-seq libraries. Expressed (n = 355 MERVLs), not expressed (n = 41 MERVLs). Datasets: ATAC-seq (n = 1), DNA methylation (n = 2), H3K27me3 (n = 1), H3K4me3 (n = 1), H3K9me3 (n = 1), RNA-seq (n = 3). (i) Expression, methylation, ATAC-seq and ChIP-seq data of HERVH-int elements with read counts in CELLO-seq libraries compared to HERVH-int elements with no counts in CELLO-seq libraries. Expressed (n = 14 HERVH-ints), not expressed (n = 110 HERVH-ints). Each dataset (n = 1), RNA-seq (n = 3).
The boxplots in d-i show the median, first and third quartiles as a box, and the whiskers indicate the most extreme data point within 1.5 lengths of the box.
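The boxplot convention stated here (median, first and third quartiles as the box, whiskers reaching the most extreme data point within 1.5 box lengths) can be written out as a short computation. This is a sketch under one assumption that the caption does not specify: quartiles are computed with linear interpolation (the "inclusive" method of Python's `statistics.quantiles`). The function name `boxplot_stats` is hypothetical.

```python
import statistics


def boxplot_stats(values):
    """Summary statistics for a boxplot with whiskers at 1.5 box lengths."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    box_length = q3 - q1                      # interquartile range
    lower_fence = q1 - 1.5 * box_length
    upper_fence = q3 + 1.5 * box_length
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),           # most extreme point within the lower fence
        "whisker_high": max(inside),          # most extreme point within the upper fence
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(boxplot_stats([1, 2, 3, 4, 5, 100]))
```

Points beyond the whiskers are reported separately as outliers, which is how most plotting libraries draw them.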
(a) DNA methylation of L1Md elements expressed in CELLO-seq mouse blastomeres. Methylation level of L1Mds across preimplantation development and in spermatogonia. Bold: L1s with full-length ORF by ORFfinder. (b) Methylation level of L1Mds across early development in human iPS cells as well as in tumour and normal tissue. Bold: L1s with full-length ORF by ORFfinder, underlined: L1s known to be mobile according to previous publications. (c-d) genome browser view of CELLO-seq reads overlapping young L1s (c) in mouse or (d) human. Arrows show direction of transcription of each L1 element.
Median accuracy and read number per coverage of error-corrected and deduplicated reads with CELLO-seq.
Specificity score of L1 elements used in this study. We used only L1 elements with specificity score >80%.
Reads overlapping young L1 elements for each cell.
Information of mobility of young L1 elements transcribed according to CELLO-seq.
Oligonucleotide sequences used in this manuscript.
Berrens, R.V., Yang, A., Laumer, C.E. et al. Locus-specific expression of transposable elements in single cells with CELLO-seq. Nat Biotechnol (2021). https://doi.org/10.1038/s41587-021-01093-1