G&T-seq: parallel sequencing of single-cell genomes and transcriptomes

Abstract

The simultaneous sequencing of a single cell's genome and transcriptome offers a powerful means to dissect genetic variation and its effect on gene expression. Here we describe G&T-seq, a method for separating and sequencing genomic DNA and full-length mRNA from single cells. By applying G&T-seq to over 220 single cells from mice and humans, we discovered cellular properties that could not be inferred from DNA or RNA sequencing alone.

Figure 1: G&T-seq enables integrated analysis of the genome and transcriptome of a single cell.
Figure 2: Simultaneous detection of chromosomal aneuploidy and gene expression dosing in single cells.

Accession codes

Primary accessions

ArrayExpress

Gene Expression Omnibus

Acknowledgements

We thank the Wellcome Trust Sanger Institute (UK) sequencing pipelines and F. Yang of the Cytogenetics Core Facility. This work was supported by the UK Wellcome Trust (to T.V. and C.P.P.) and funding from the Belgian Research Foundation Flanders (FWO) and the University of Leuven (KU Leuven, Belgium) to T.V. (FWO–G.0687.12; KU Leuven SymBioSys, PFV/10/016). N.V.d.A. is supported by an FWO scholarship (FWO–1.1.H28.12). W.H. and C.P.P. are funded by the UK Medical Research Council. L.M.S. was funded by the EU Seventh Framework Programme (FP7/2007-2013) under grant 262055. M.Z.-G. and the work in the lab are funded by the UK Wellcome Trust. M.G. is supported by a UK Mary Gray Studentship from St. John's College, Cambridge, UK. N.S. was supported by the New Zealand Woolf-Fisher Trust. F.J.L. is supported by a UK Wellcome Trust Senior Investigator award. M.J.T. is supported by a Wellcome Trust Sanger Institute Clinical Ph.D. Fellowship (UK). Y.I.L. was supported by a University of Oxford Nuffield Department of Medicine Prize Studentship, UK. Trisomy 21 iPSCs were obtained from the Harvard Stem Cell Institute (Cambridge, Massachusetts, USA), and control iPSCs were a gift from Y. Takashima (Cambridge Stem Cell Institute, Cambridge, UK).

Author information

Contributions

I.C.M. developed the method, performed experiments, analyzed data and wrote the paper. W.H., P.K., Y.I.L. and T.X.H. analyzed data and prepared figures and text for the paper. M.J.T. performed experiments and assisted with method development. N.V.d.A. provided cells and assisted with method development. M.G. and M.Z.-G. provided mouse blastomeres. N.S. and F.J.L. provided iPSC-derived neurons. P.C., L.M.S., M.S., P.D.E., M.A.Q. and H.P.S. assisted with library preparation for targeted, HiSeq X and PacBio sequencing. R.B. performed cytogenetic analysis of cell lines. C.P.P. and T.V. acquired funding, oversaw the research, designed the method, analyzed data and wrote the paper. All authors read and approved the manuscript for submission.

Corresponding authors

Correspondence to Iain C Macaulay or Chris P Ponting or Thierry Voet.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Performance of G&T-seq whole-genome amplification in HCC38 and HCC38-BL cells.

(a) Copy-number concordance between bulk DNA sequencing of HCC38(-BL) cells and single-cell or multicell G&T-seq following MDA or PicoPlex WGA. For reference, single-cell DNA copy-number concordances obtained with conventional MDA and PicoPlex are shown. (b) Heat map of the genome-wide DNA copy number (LogR) in single cells and in multicell controls isolated from HCC38 and HCC38-BL cells and amplified using MDA. For reference, the copy-number profile derived from bulk HCC38 DNA (not subjected to WGA) is shown on the left. (c) Lorenz curve illustrating the relationship between the cumulative fraction of the genome covered (x-axis) and the cumulative fraction of mapped bases (y-axis). (d) Normalized read count as a function of %GC content. The distributions are shown for all HCC38 G&T-seq samples amplified with MDA (purple) and PicoPlex (green). For comparison, the distributions for bulk (no WGA, blue), conventional single-cell MDA (black) and conventional single-cell PicoPlex (orange) are shown.
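The Lorenz curve in panel (c) can be illustrated with a minimal sketch. It assumes coverage is summarized as a list of read depths per equal-sized genomic bin; the function names `lorenz_curve` and `gini` are hypothetical and are not taken from the paper's analysis code. The Gini-style summary is added only as a convenient way to compare curves and is not stated in the caption.

```python
def lorenz_curve(depths):
    """Return (xs, ys) for a Lorenz curve of per-bin read depths.

    xs: cumulative fraction of the genome (bins sorted by increasing depth)
    ys: cumulative fraction of mapped bases contained in those bins
    Assumes all bins have equal size (an illustrative simplification).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    total = sum(depths)
    if total <= 0:
        raise ValueError("total depth must be positive")
    ordered = sorted(depths)
    xs, ys = [0.0], [0.0]
    running = 0
    for i, depth in enumerate(ordered, start=1):
        running += depth
        xs.append(i / len(ordered))
        ys.append(running / total)
    return xs, ys


def gini(xs, ys):
    """Summarize unevenness from the curve: 0 means perfectly uniform coverage."""
    # Trapezoidal area under the Lorenz curve.
    area = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
        for i in range(1, len(xs))
    )
    return 1 - 2 * area


if __name__ == "__main__":
    uniform = lorenz_curve([5, 5, 5, 5])
    skewed = lorenz_curve([0, 0, 0, 10])
    print(gini(*uniform), gini(*skewed))
```

A curve close to the diagonal indicates even amplification, whereas a curve that bows away from the diagonal indicates that a small fraction of the genome absorbs most of the mapped bases.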

Supplementary Figure 2 Performance of G&T-seq whole-transcriptome amplification in HCC38 and HCC38-BL cells.

(a) Transcript detection following G&T-seq of HCC38 and HCC38-BL single cells. The number of expressed genes (y-axis) in HCC38 single cells (red lines) and HCC38-BL single cells (blue lines) versus TPM (x-axis). At TPM > 1 (dashed line), between 4,000 and 11,000 transcripts were detected per cell, with substantially more transcripts detected in HCC38 cells. (b) Principal-component analysis of HCC38 and HCC38-BL single-cell transcriptomes. Cells in which genomic aneuploidies were detected are highlighted. (c) Heat map displaying Spearman correlation of 8,237 protein-coding genes expressed in at least 32 samples with TPM > 1. (d) Expanded heat map showing the top 200 differentially expressed genes between HCC38 and HCC38-BL cells. The TPM of each gene is normalized to that gene's median TPM across all samples and is presented as the log2-fold difference from this median.
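The median normalization described for panel (d) might be sketched as follows. The pseudocount is an assumption introduced here to avoid taking the logarithm of zero; the caption does not say how zero TPM values were handled, and the function name `log2_fold_from_median` is hypothetical.

```python
import math
from statistics import median


def log2_fold_from_median(tpm_by_sample, pseudocount=1.0):
    """Express one gene's TPM in each sample as a log2-fold difference
    from that gene's median TPM across all samples.

    pseudocount is an illustrative assumption, not a value from the paper.
    """
    if not tpm_by_sample:
        raise ValueError("need at least one sample")
    reference = median(tpm_by_sample.values())
    return {
        sample: math.log2((tpm + pseudocount) / (reference + pseudocount))
        for sample, tpm in tpm_by_sample.items()
    }


if __name__ == "__main__":
    # Hypothetical TPM values for a single gene in three cells.
    print(log2_fold_from_median({"cell_a": 1.0, "cell_b": 3.0, "cell_c": 7.0}))
```

Applying the function gene by gene centers every row of the heat map on zero, so colors reflect deviation from that gene's typical expression rather than its absolute level.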

Supplementary Figure 3 Sequence coverage over transcript length and intronic and gene flanking regions in single-cell G&T-seq transcriptome data.

Read coverage in (a) 2 kb, (b) 10 kb and (c) 15 kb transcripts is shown. Numbers indicate the distance from the poly(A) tail in the exonic region only. Regions upstream of the transcription start site (TSS) and transcription termination site (TTS), as well as intronic regions, are also shown.

Supplementary Figure 4 Comparison of RNA-seq data generated with the G&T-seq and conventional Smart-seq2 protocols.

In this comparison, 28 single cells (8 HCC38 and 20 HCC38-BL single cells) were used for G&T-seq, and 20 single cells (14 HCC38 and 6 HCC38-BL single cells) were used for conventional Smart-seq2. Importantly, these cells came from the same cultures, were isolated at the same time, were processed (when possible) with the same batches of reagents, and were eventually sequenced together. (a) Transcript detection following G&T-seq or conventional Smart-seq2 amplification of HCC38 and HCC38-BL single cells. The number of transcripts detected at TPM > 1 is displayed. (b) Detection of ERCC transcripts relative to ERCC input amount; the plot shows the averaged normalized read count across all single-cell samples in a G&T-seq experiment versus the number of molecules of each ERCC sequence that was spiked in. (c) Detection of ERCC transcripts relative to ERCC input amount in a parallel Smart-seq2 experiment. (d) Sequence coverage over transcript length and intronic and gene flanking regions in single-cell G&T-seq and Smart-seq2 transcriptome data. Read coverage in 2 kb transcripts is shown. Numbers indicate the distance from the poly(A) tail in the exonic region only. Regions upstream of the transcription start site (TSS) and transcription termination site (TTS), as well as intronic regions, are also shown. (e) Transcript detection in bins of transcript GC content for HCC38 and HCC38-BL single-cell transcriptomes generated by G&T-seq and Smart-seq2 (SS2). The upper panel shows the proportion of genes detected in each bin, and the lower panel displays the proportion of GC content in each bin.

Supplementary Figure 5 Interphase FISH to detect trisomy 11 in a subset of HCC38-BL cells.

Chromosomes 11 and 3 were hybridized with centromeric probes labeled with FITC and Texas Red, respectively. The majority of HCC38-BL cells had disomy 11 (a), whereas trisomy 11 was observed in 2 out of 100 HCC38-BL cells analyzed (b).

Supplementary Figure 6 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo A) containing sister cells with reciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). Cell 1 failed QC at the genome level. Reciprocal aneuploidies were observed for cells 4 and 5 at chromosomes 2, 5 and 16. (b) Genome-wide expression binned per chromosome in the control (n = 16 cells, untreated and shown in blue) and reversine-treated (n = 8 cells, shown in red) embryos (RPKM of the latter are relative to the median-centered control RPKMs). The expected expression dosage resulting from the aneuploidies for chromosomes 2, 5 and 16 in the blastomeres (cells 4 and 5) was detected in the correct cell’s transcriptome. Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with a black asterisk. Cell 1, which also failed DNA-seq QC, is highlighted with a red asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 7 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo B) containing sister cells with reciprocal and nonreciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). A complex pattern of aneuploidy was observed in cell 2 (gain of chromosomes 4 and 16 and loss of chromosomes 5, 14, 18 and 19). Cell 3 had a gain in chromosome 15, and cell 5 had a gain in chromosome 8, while cell 8 gained an X-chromosome. (b) Genome-wide expression binned per chromosome comparing the cells from embryo B (reversine-treated, shown in red, n = 8) with those from control embryos (n = 16 cells, untreated and shown in blue). Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with an asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 8 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo C) containing sister cells with reciprocal and nonreciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). Cell 1 failed QC following DNA-seq. Cell 2 had a loss of chromosome 11, whereas cells 3 and 6 showed reciprocal gains and losses at chromosomes 13 and 14. Cell 8 had lost a copy of chromosome 13. (b) Genome-wide expression binned per chromosome comparing the cells from embryo C (reversine-treated, shown in red, n = 8) with those from control embryos (n = 16 cells, untreated and shown in blue). Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with a black asterisk. Cell 1, which failed DNA-seq QC, is highlighted with a red asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 9 Relationship between chromosomal copy number and chromosome-wide expression in a mouse embryo at the eight-cell stage.

Reversine-treated mouse embryo at the eight-cell stage (embryo E) containing sister cells with reciprocal and nonreciprocal aneuploidies. (a) The genome-wide copy-number profile is shown for all eight cells in the embryo (numbered 1–8). Cells 1 and 4 had reciprocal aneuploidies for chromosomes 4, 7, 8, 10, 18 and 19, with cell 1 having an additional nonreciprocal loss of chromosome 6. Cell 2 had gains of chromosomes 15 and X. Cell 3 had a gain of chromosome 1 and losses of chromosomes 4 and X. Cell 5 had losses of chromosomes 9 and 17. Cell 6 had gains of chromosomes 6, 8 and 9 and losses of chromosomes 15 and X. Cells 7 and 8 had a loss of chromosome X. (b) Genome-wide expression binned per chromosome comparing the cells from embryo E (reversine-treated, shown in red) with those from control embryos (n = 16 cells, untreated and shown in blue). Cells displaying concordantly higher and lower overall expression per chromosome are highlighted with an asterisk. For all box plots, the lower and upper boundaries of the box represent, respectively, the 25th and 75th percentiles, with the bar being equal to the median. The whiskers represent the 5th and 95th percentiles.

Supplementary Figure 10 Relationship between chromosomal-arm copy number and chromosome-arm-wide expression in iPSC-derived neurons.

MA plot comparing the log2 ratio in mRNA expression levels between p and q chromosomal arms (M) to the average expression across the chromosome arms (A) for all cells containing trisomy 21. The acrocentric chromosomes 13, 14, 15, 21 and 22 and chromosome Y have been excluded. The values for chromosome 20 are shown in green for cells without evidence of gain or loss of the chromosomal arms, and cells with genomic evidence for loss of 20p and gain of 20q are shown in purple. Numbers indicate cell identifiers.
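The M and A values of such a plot can be computed as in the sketch below. It assumes that A is the mean of the log2 expression of the two arms, which is the usual MA-plot convention but is not spelled out in the caption, and it adds a pseudocount so that arms with zero expression remain defined. The function name `ma_point` and the input values are hypothetical.

```python
import math


def ma_point(p_arm_expr, q_arm_expr, pseudocount=1.0):
    """Return (M, A) for one chromosome in one cell.

    M: log2 ratio of p-arm to q-arm expression
    A: mean of the log2 expression of the two arms (assumed convention)
    pseudocount is an illustrative assumption, not a value from the paper.
    """
    if p_arm_expr < 0 or q_arm_expr < 0:
        raise ValueError("expression values must be non-negative")
    log_p = math.log2(p_arm_expr + pseudocount)
    log_q = math.log2(q_arm_expr + pseudocount)
    return log_p - log_q, (log_p + log_q) / 2


if __name__ == "__main__":
    # Hypothetical summed expression for the p and q arms of one chromosome.
    m_value, a_value = ma_point(3.0, 15.0)
    print(m_value, a_value)
```

Under this convention, a cell that has lost 20p and gained 20q would be expected to show a more negative M for chromosome 20 than cells with balanced arms.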

Supplementary Figure 11 Detection of a coding interchromosomal fusion in the genome and transcriptome of a single cell.

(a) Identification of a fusion transcript in the RNA-seq data from a single HCC38 cell (cell 63). A subset of the reads mapping to a fusion between exon 6 of MTAP (gene locus on chromosome 9) and exon 3 of PCDH7 (gene locus on chromosome 4) are shown. (b) Sequencing of single-cell cDNA using the PacBio RSII revealed that the full-length MTAP-PCDH7 fusion transcript consisted of exons 1–6 of MTAP and exons 3, 4 and 6 of PCDH7. Six mapped reads following single-molecule PacBio cDNA sequencing of a single cell are shown. (c) Illumina HiSeq X DNA sequence reads crossing the causative interchromosomal fusion between chromosomes 4 and 9 in the genome of the same single cell (HCC38 cell 63). A subset of the reads mapping across the genomic fusion are shown; the breakpoint itself is located at a distance of 3,208 bases downstream of exon 6 of MTAP and 105,180 bases upstream from exon 3 of PCDH7.

Supplementary Figure 12 Confirmation of MTAP-PCDH7 expression and detection of the associated genomic fusion by qPCR.

Taqman primer and probe sets were designed to detect (a) the MTAP-PCDH7 fusion transcript and (b) the genomic breakpoint that fuses chromosomes 4 and 9. Examples of consensus reads mapping across both breakpoints are shown, with the MTAP side colored red and the PCDH7 side colored blue. Primer/probe sets were specifically designed to span the breakpoints in both cases. (c) Detection of the MTAP-PCDH7 fusion transcript in cDNA from G&T-seq of HCC38 and HCC38-BL cells. (d) Detection of the MTAP-PCDH7 genomic fusion in MDA-amplified DNA from G&T-seq of HCC38 and HCC38-BL cells. (e) Venn diagram showing the overlap of detection of the fusion transcript and associated genomic rearrangement in parallel from the same single cells.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–12, and Supplementary Tables 1 and 2 (PDF 17785 kb)

Supplementary Data 1

Excel spreadsheet containing sequencing and QC metrics for all cells presented in the paper. (XLSX 86 kb)

Cite this article

Macaulay, I., Haerty, W., Kumar, P. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 12, 519–522 (2015). https://doi.org/10.1038/nmeth.3370
