Paired DNA and RNA profiling is increasingly employed in genomics research to uncover molecular mechanisms of disease and to explore personal genotype and phenotype correlations. Here, we introduce Simul-seq, a technique for the production of high-quality whole-genome and transcriptome sequencing libraries from small quantities of cells or tissues. We apply the method to laser-capture-microdissected esophageal adenocarcinoma tissue, revealing a highly aneuploid tumor genome with extensive blocks of increased homozygosity and corresponding increases in allele-specific expression. Among this widespread allele-specific expression, we identify germline polymorphisms that are associated with response to cancer therapies. We further leverage these integrative data to uncover expressed mutations in several known cancer genes as well as a recurrent mutation in the motor domain of KIF3B that significantly affects kinesin–microtubule interactions. Simul-seq provides a new streamlined approach for generating comprehensive genome and transcriptome profiles from limited quantities of clinically relevant samples.
Shah, S.P. et al. The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature 486, 395–399 (2012).
Grubert, F. et al. Genetic control of chromatin states in humans involves local and distal chromosomal interactions. Cell 162, 1051–1065 (2015).
Stranger, B.E. et al. Population genomics of human gene expression. Nat. Genet. 39, 1217–1224 (2007).
Ongen, H. et al. Putative cis-regulatory drivers in colorectal cancer. Nature 512, 87–90 (2014).
Li, J.B. et al. Genome-wide identification of human RNA editing sites by parallel DNA capturing and sequencing. Science 324, 1210–1213 (2009).
Tuch, B.B. et al. Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations. PLoS One 5, e9317 (2010).
Su, A.I. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. USA 101, 6062–6067 (2004).
Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
Macaulay, I.C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).
Lam, H.Y.K. et al. Performance comparison of whole-genome sequencing platforms. Nat. Biotechnol. 30, 78–82 (2011).
Adey, A. et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010).
Baker, S.C. et al. The External RNA Controls Consortium: a progress report. Nat. Methods 2, 731–734 (2005).
Lawrence, M.S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).
Weinstein, J.N. et al. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of urothelial bladder carcinoma. Nature 507, 315–322 (2014).
Zhao, M., Kim, P., Mitra, R., Zhao, J. & Zhao, Z. TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genes. Nucleic Acids Res. 44, D1023–D1031 (2015).
Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Kumar, P., Henikoff, S. & Ng, P.C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081 (2009).
Yadav, M. et al. Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature 515, 572–576 (2014).
Robbins, P.F. et al. Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat. Med. 19, 747–752 (2013).
Schumacher, T.N. & Schreiber, R.D. Neoantigens in cancer immunotherapy. Science 348, 69–74 (2015).
Futreal, P.A. et al. A census of human cancer genes. Nat. Rev. Cancer 4, 177–183 (2004).
Joerger, A.C., Ang, H.C. & Fersht, A.R. Structural basis for understanding oncogenic p53 mutations and designing rescue drugs. Proc. Natl. Acad. Sci. USA 103, 15056–15061 (2006).
Bullock, A.N., Henckel, J. & Fersht, A.R. Quantitative analysis of residual folding and DNA binding in mutant p53 core domain: definition of mutant states for rescue in cancer therapy. Oncogene 19, 1245–1256 (2000).
Gautschi, O. et al. Cyclin D1 (CCND1) A870G gene polymorphism modulates smoking-induced lung cancer risk and response to platinum-based chemotherapy in non-small cell lung cancer (NSCLC) patients. Lung Cancer 51, 303–311 (2006).
Absenger, G. et al. The cyclin D1 (CCND1) rs9344 G>A polymorphism predicts clinical outcome in colon cancer patients treated with adjuvant 5-FU-based chemotherapy. Pharmacogenomics J. 14, 130–134 (2014).
Gonçalves, A. et al. A polymorphism of EGFR extracellular domain is associated with progression free-survival in metastatic colorectal cancer patients receiving cetuximab-based treatment. BMC Cancer 8, 169 (2008).
Hsieh, Y.Y., Tzeng, C.H., Chen, M.H., Chen, P.M. & Wang, W.S. Epidermal growth factor receptor R521K polymorphism shows favorable outcomes in KRAS wild-type colorectal cancer patients treated with cetuximab-based chemotherapy. Cancer Sci. 103, 791–796 (2012).
Yu, Y. & Feng, Y.-M. The role of kinesin family proteins in tumorigenesis and progression: potential biomarkers and molecular targets for cancer therapy. Cancer 116, 5150–5160 (2010).
Jimbo, T. et al. Identification of a link between the tumour suppressor APC and the kinesin superfamily. Nat. Cell Biol. 4, 323–327 (2002).
Woehlke, G. et al. Microtubule interaction site of the kinesin motor. Cell 90, 207–216 (1997).
Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).
Macaulay, I.C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
Adiconis, X. et al. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat. Methods 10, 623–629 (2013).
Zhao, W. et al. Comparison of RNA-Seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genomics 15, 419 (2014).
Nones, K. et al. Genomic catastrophes frequently arise in esophageal adenocarcinoma and drive tumorigenesis. Nat. Commun. 5, 5224 (2014).
Agrawal, N. et al. Comparative genomic analysis of esophageal adenocarcinoma and squamous cell carcinoma. Cancer Discov. 2, 899–905 (2012).
Dulak, A.M. et al. Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nat. Genet. 45, 478–486 (2013).
Haraguchi, K., Hayashi, T., Jimbo, T., Yamamoto, T. & Akiyama, T. Role of the kinesin-2 family protein, KIF3, during mitosis. J. Biol. Chem. 281, 4094–4099 (2006).
Liu, X. et al. Small molecule induced reactivation of mutant p53 in cancer cells. Nucleic Acids Res. 41, 6034–6044 (2013).
Stachler, M.D. et al. Paired exome analysis of Barrett's esophagus and adenocarcinoma. Nat. Genet. 47, 1047–1055 (2015).
Moriai, T., Kobrin, M.S., Hope, C., Speck, L. & Korc, M. A variant epidermal growth factor receptor exhibits altered type alpha transforming growth factor binding and transmembrane signaling. Proc. Natl. Acad. Sci. USA 91, 10217–10221 (1994).
Zhang, W. et al. Cyclin D1 and epidermal growth factor polymorphisms associated with survival in patients with advanced colorectal cancer treated with Cetuximab. Pharmacogenet. Genomics 16, 475–483 (2006).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Langmead, B. & Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).
Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Wang, L., Wang, S. & Li, W. RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).
Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Liao, Y., Smyth, G.K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Roth, A. et al. JointSNVMix: a probabilistic model for accurate detection of somatic mutations in normal/tumour paired next-generation sequencing data. Bioinformatics 28, 907–913 (2012).
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
Larson, D.E. et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–317 (2012).
Koboldt, D.C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Wang, J. et al. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat. Methods 8, 652–654 (2011).
Zhang, J. et al. INTEGRATE: gene fusion discovery using whole genome and transcriptome data. Genome Res. 26, 108–118 (2016).
Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
Cingolani, P. et al. Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift. Front. Genet. 3, 35 (2012).
Romanel, A., Lago, S., Prandi, D., Sboner, A. & Demichelis, F. ASEQ: fast allele-specific studies from next-generation sequencing data. BMC Med. Genomics 8, 9 (2015).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Stock, M.F. & Hackney, D.D. Expression of kinesin in Escherichia coli. Methods Mol. Biol. 164, 43–48 (2001).
We thank C. Araya, C. Cenik, P. Dumesic, D. Phanstiel and D. Webster for many helpful discussions and input regarding the manuscript and analyses. We acknowledge J. Churko from the laboratory of J. Wu at Stanford University for providing the fibroblasts as well as the work of both the sequencing core at the Stanford Center for Genomics and Personalized Medicine and the Genetics Bioinformatics Service Center, with special thanks to G. Euskirchen, L. Ramirez, C. Eastman, N. Watson and N. Hammond. Finally, we would like to thank H. Chen from Bina Technologies.
M.P.S. is a cofounder of Personalis and a member of the scientific advisory boards of Personalis and Genapsys.
Integrated supplementary information
(a) Histogram of incubation times for parallel Tru-seq DNA and RNA library preparation as well as Simul-seq. (b) High-sensitivity DNA bioanalyzer trace for a yeast/human mixed Simul-seq library. Note, this trace is representative of an average Simul-seq library. (c-d) Representative droplet digital PCR (ddPCR) raw fluorescence amplitude data (left) and assay design (right) for quantification of DNA (c) and RNA (d) constituents of Simul-seq libraries.
(a-b) Venn diagrams comparing SNV (a) and indel (b) calls between the Simul-seq genome and two DNA-seq control genomes derived from different tissues of the same individual (ref. 13).
Bar graph of all Ensembl biotype annotations for genes with FPKM values greater than or equal to 5.
Scatter plots of Log10(FPKM+1) gene measurements for Simul-seq and RNA-seq replicates. Spearman’s ρ correlation values for each comparison are shown.
(a) Coverage distributions for Simul-seq libraries of the same individual. (b) Venn diagrams comparing SNV calls between the Simul-seq replicates. (c) Scatter plots of Log10(FPKM+1) gene measurements for 50K Simul-seq replicates (Spearman’s ρ=0.97). (d) Correlation between External RNA Controls Consortium (ERCC) spike-in control Log10 RNA concentrations versus the average Log10(RPKM+1) for Simul-seq (blue; Spearman’s ρ=0.97) and 50K Simul-seq (orange; Spearman’s ρ=0.96) fibroblast replicates (n=2/group). Note, zero values have been shifted to 1, and all ERCC transcripts are shown.
(a) Distribution of normalized transcript coverage for RNA-seq and Simul-seq replicates performed on fibroblasts as well as Simul-seq data obtained for esophageal adenocarcinoma tissue isolated using laser capture microscopy (Simul-seq EAC). (b) Strand specificity of Simul-seq and RNA-seq samples. (c) The fraction of reads mapping to various genomic annotations for Simul-seq and RNA-seq samples. Note, an increased intronic read fraction combined with a similar intergenic read fraction in the Simul-seq EAC sample likely indicates increased intron retention and/or a higher proportion of unspliced RNA in this specimen.
Supplementary Figure 7 Targeted resequencing of KIF3B locus in esophageal adenocarcinoma patient samples.
(a) Histogram of the unique and unmapped Bowtie-aligned reads obtained for 76 FFPE samples (50 tumors and 26 normals). The original sample (02-28923-C9) that was subjected to the Simul-seq protocol was included as a positive control. A single tumor-normal pair (00-18224-A2) displayed a substantially higher number of variant calls yet a lower number of uniquely mapped reads, suggesting that these samples harbored increased rates of PCR errors induced by low-quality genomic DNA. Therefore, these samples were not included in somatic mutation analysis. (b) Validation of variant calls using pyrophosphate sequencing.
(a) Schematic of KIF3B protein, with motor domain and ATP binding region highlighted in blue and red, respectively. For biochemical assays, a region spanning the motor domain of KIF3B (amino acids 1-365) was cloned and recombinantly expressed with an N-terminal 6x-Histidine tag (bottom). (b) Coomassie stained gel of recombinant proteins pre- and post-induction with Isopropyl β-D-1-thiogalactopyranoside (IPTG) as well as after Ni2+ affinity purification.
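Several of the captions above report Spearman's ρ between Log10(FPKM+1) gene measurements for replicate libraries. The following is a minimal Python sketch of that kind of comparison, not the authors' analysis code. The function names and the FPKM values are hypothetical and chosen only for illustration. Because Spearman's ρ is rank-based, the Log10(FPKM+1) transform affects how the scatter plot looks but not the value of ρ.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_log_fpkm(fpkm_a, fpkm_b):
    """Spearman's rho between log10(FPKM + 1) values of two replicates."""
    log_a = [math.log10(v + 1) for v in fpkm_a]
    log_b = [math.log10(v + 1) for v in fpkm_b]
    return pearson(average_ranks(log_a), average_ranks(log_b))


# Hypothetical per-gene FPKM values for two replicate libraries.
replicate_1 = [0.0, 1.2, 5.0, 48.3, 310.0, 7.7]
replicate_2 = [0.1, 0.9, 6.1, 51.0, 280.5, 7.0]
rho = spearman_log_fpkm(replicate_1, replicate_2)
```

In this toy input the two replicates order the genes identically, so ρ evaluates to 1 even though the FPKM values themselves differ.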
Supplementary Figures 1–8 and Supplementary Note. (PDF 1853 kb)
Simul-seq and control library read counts and mapping rates. (XLSX 12 kb)
Somatic SVs for Simul-seq EAC tumor genome (XLSX 35 kb)
Somatic, expressed gene fusions in Simul-seq EAC tumor genome (XLSX 9 kb)
VCF of somatic SNVs for Simul-seq EAC tumor genome (XLSX 2379 kb)
VCF of somatic indels for Simul-seq EAC tumor genome (XLSX 474 kb)
EAC tumor ASE analysis at heterozygous SNV positions in the normal genome (XLSX 7583 kb)
ASE of annotated tumor suppressor genes harboring damaging germline variants (XLSX 12 kb)
Simul-seq RNA and RNA-seq ERCC spike-in transcript quantification (XLSX 16 kb)
Genomic regions of KIF3B locus targeted for resequencing (XLSX 8 kb)
Primer sets used in KIF3B targeted resequencing (XLSX 11 kb)
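The supplementary tables above include an allele-specific expression (ASE) analysis at heterozygous SNV positions. As a rough illustration of how allelic imbalance at one such position can be assessed, the sketch below applies an exact two-sided binomial test to reference and alternate RNA read counts. This is an assumption made for illustration and not necessarily the procedure used in the paper. The function name, the default expected reference fraction of 0.5, and the read counts in the test cases are all hypothetical.

```python
import math


def binomial_two_sided_p(ref_reads, alt_reads, p_ref=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumes 0 < p_ref < 1. Sums the probabilities of all outcomes that are
    no more likely than the observed reference-read count.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance

    def pmf(k):
        # Log-space binomial probability to stay stable at high read depth.
        log_p = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p_ref) + (n - k) * math.log(1 - p_ref)
        )
        return math.exp(log_p)

    observed = pmf(ref_reads)
    # Small relative tolerance so equally likely outcomes are not dropped
    # because of floating-point rounding.
    total = sum(
        pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical read counts at two heterozygous positions.
balanced_p = binomial_two_sided_p(5, 5)      # equal allelic support
imbalanced_p = binomial_two_sided_p(10, 0)   # all reads carry one allele
```

In practice, p-values from many positions would need multiple-testing correction, and reference-mapping bias can shift the expected reference fraction away from 0.5, which is why `p_ref` is left as a parameter.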
About this article
Cite this article
Reuter, J., Spacek, D., Pai, R. et al. Simul-seq: combined DNA and RNA sequencing for whole-genome and transcriptome profiling. Nat Methods 13, 953–958 (2016). https://doi.org/10.1038/nmeth.4028