Efficient targeted transcript discovery via array-based normalization of RACE libraries

Djebali, Sarah; Kapranov, Philipp; Foissac, Sylvain; Lagarde, Julien; Reymond, Alexandre; Ucla, Catherine; Wyss, Carine; Drenkow, Jorg; Dumais, Erica; Murray, Ryan R; Lin, Chenwei; Szeto, David; Denoeud, France; Calvo, Miquel; Frankish, Adam; Harrow, Jennifer; Makrythanasis, Periklis; Vidal, Marc; Salehi-Ashtiani, Kourosh; Antonarakis, Stylianos E; Gingeras, Thomas R; Guigó, Roderic

doi:10.1038/nmeth.1216

Article
Published: 25 May 2008

Nature Methods volume 5, pages 629–635 (2008)

Abstract

Rapid amplification of cDNA ends (RACE) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. To improve sampling efficiency of human transcripts, we hybridized the products of the RACE reaction onto tiling arrays and used the detected exons to delineate a series of reverse-transcriptase (RT)-PCRs, through which the original RACE transcript population was segregated into simpler transcript populations. We independently cloned the products and sequenced randomly selected clones. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy indeed leads to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.
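The sampling argument in the abstract can be illustrated with a small calculation. The sketch below is not the authors' theoretical model; it assumes that each sequenced clone is an independent draw with probability proportional to transcript abundance, and it uses hypothetical abundance values (`skewed_pool`, `normalized_pool`) chosen only for illustration. Under those assumptions, the expected number of distinct transcripts recovered after n clones is the sum over transcripts of 1 - (1 - p_i)^n.

```python
def expected_distinct(abundances, n_clones):
    """Expected number of distinct transcripts seen among n_clones random picks.

    Assumption: each clone is drawn independently, with probability
    proportional to transcript abundance (sampling with replacement).
    """
    total = float(sum(abundances))
    return sum(1.0 - (1.0 - a / total) ** n_clones for a in abundances)


# Hypothetical pools of 10 transcripts (illustrative numbers, not from the paper).
skewed_pool = [1000, 100, 10, 1, 1, 1, 1, 1, 1, 1]  # large dynamic range
normalized_pool = [1] * 10                           # equal abundances

# Compare how many distinct transcripts are expected for a few clone counts.
for n in (10, 20, 50):
    print(n,
          round(expected_distinct(skewed_pool, n), 2),
          round(expected_distinct(normalized_pool, n), 2))
```

For any fixed number of clones, the equal-abundance pool yields more distinct transcripts in expectation than the skewed pool, which is the intuition behind segregating the RACE population into simpler, more evenly represented populations before cloning.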

**Figure 1: Strategy for comprehensive characterization of new isoforms of annotated genes.**

**Figure 2: Examples of new RACEfrags verified by RT-PCR, cloning and sequencing.**

**Figure 3: Genomic coverage of RACEfrags originating from different tissues or combinations of tissues.**

**Figure 4: Absolute number and cumulative proportion of projected RACEfrags originating from index exons.**

**Figure 5: Distribution of distances of RACEfrags to assigned index exons.**

Accession codes

Accessions

Gene Expression Omnibus

GSE11433

Acknowledgements

The project at Institut Municipal d'Investigació Mèdica, Center for Genomic Regulation (CRG), the Universities of Lausanne and Geneva, and Affymetrix was supported by grants U01HG003150 and U01HG003147 from the US National Human Genome Research Institute, National Institutes of Health; at IMIM and CRG also funded by grant BIO2006-03380 from the Spanish Ministry of Education and Science and from the European BioSapiens Consortium; at the Universities of Lausanne and Geneva also funded by the Swiss National Science Foundation, the EU AnEUploidy project and the National Center of Competence in Research Frontiers in Genetics; and at Affymetrix also funded by the National Cancer Institute, National Institutes of Health (N01-CO-12400) and by Affymetrix, Inc. The portion of this work carried out at Center for Cancer Systems Biology was funded by a grant from the Ellison Foundation (to M.V.) and as Institute Sponsored Research from the Dana Farber Cancer Institute Strategic Initiative. We acknowledge J.M. Oller for reviewing the probabilistic results and R. Castelo, C. Howald and D. Martin for useful suggestions.

Author information

Sarah Djebali, Philipp Kapranov, Sylvain Foissac, Julien Lagarde and Alexandre Reymond: These authors contributed equally to this work.

Authors and Affiliations

Grup de Recerca en Informàtica Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, Dr. Aiguader 88, Barcelona, 08003, Spain
Sarah Djebali, Julien Lagarde, France Denoeud & Roderic Guigó
Affymetrix, Inc., 3420 Central Expressway, Santa Clara, 95051, California, USA
Philipp Kapranov, Jorg Drenkow, Erica Dumais & Thomas R Gingeras
Center for Genomic Regulation, Dr. Aiguader 88, Barcelona, 08003, Spain
Sylvain Foissac & Roderic Guigó
Center for Integrative Genomics, University of Lausanne, Genopole Building, Lausanne, 1015, Switzerland
Alexandre Reymond
Department of Genetic Medicine and Development, University of Geneva Medical School, 1 rue Michel Servet, Geneva, 1211, Switzerland
Catherine Ucla, Carine Wyss, Periklis Makrythanasis & Stylianos E Antonarakis
Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, 44 Binney Street, Boston, 02115, Massachusetts, USA
Ryan R Murray, Chenwei Lin, David Szeto, Marc Vidal & Kourosh Salehi-Ashtiani
Departament d'Estadística, Universitat de Barcelona, Diagonal 645, Barcelona, 08028, Spain
Miquel Calvo
Human and Vertebrate Analysis and Annotation Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1HH, UK
Adam Frankish & Jennifer Harrow

Contributions

T.R.G., S.E.A., A.R., P.K. and R.G. participated in the overall design of the experiments and the subsequent analysis. A.R., C.U., C.W., P.M. and S.E.A. performed the RACE reactions. J.D., E.D. and P.K. performed the hybridization of the RACE reactions onto tiling arrays. R.R.M., C.L., D.S., K.S.-A. and M.V. carried out the RT-PCRs and the cloning and sequencing of candidates. S.D., S.F., J.L., F.D. and R.G. developed software and carried out the bioinformatics analysis. M.C. developed the theoretical model for sampling and carried out the computational simulations. A.F. and J.H. provided the reference gene annotation and helped map the RT-PCR sequences to the genome.

Corresponding author

Correspondence to Roderic Guigó.

Ethics declarations

Competing interests

P.K., J.D., E.D. and T.R.G. are Affymetrix employees.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1–2, Supplementary Methods, Supplementary Results (PDF 1195 kb)

About this article

Cite this article

Djebali, S., Kapranov, P., Foissac, S. et al. Efficient targeted transcript discovery via array-based normalization of RACE libraries. Nat Methods 5, 629–635 (2008). https://doi.org/10.1038/nmeth.1216

Received: 12 March 2008
Accepted: 24 April 2008
Published: 25 May 2008
Issue Date: July 2008
DOI: https://doi.org/10.1038/nmeth.1216

This article is cited by

Evidence for widespread existence of functional novel and non-canonical human transcripts
- Dongyang Xu
- Lu Tang
- Philipp Kapranov
BMC Biology (2023)
ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data
- Bernardo Rodríguez-Martín
- Emilio Palumbo
- Sarah Djebali
BMC Genomics (2017)
Genome-wide Identification and Characterization of Natural Antisense Transcripts by Strand-specific RNA Sequencing in Ganoderma lucidum
- Junjie Shao
- Haimei Chen
- Chang Liu
Scientific Reports (2017)
Intron retention and transcript chimerism conserved across mammals: Ly6g5b and Csnk2b-Ly6g5b as examples
- Francisco Hernández-Torres
- Alberto Rastrojo
- Begoña Aguado
BMC Genomics (2013)
Genome-wide functional annotation and structural verification of metabolic ORFeome of Chlamydomonas reinhardtii
- Lila Ghamsari
- Santhanam Balaji
- Kourosh Salehi-Ashtiani
BMC Genomics (2011)
