Rapid amplification of cDNA ends (RACE) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. To improve the sampling efficiency of human transcripts, we hybridized the products of the RACE reaction onto tiling arrays and used the detected exons to design a series of reverse-transcriptase (RT)-PCRs, through which the original RACE transcript population was segregated into simpler transcript populations. We independently cloned the products of each RT-PCR and sequenced randomly selected clones. This approach, RACEarray, is superior to direct cloning and sequencing of RACE products because it specifically targets new transcripts and often results in overall normalization of transcript abundance. We show theoretically and experimentally that this strategy indeed leads to efficient sampling of new transcripts, and we investigated multiplexing the strategy by pooling RACE reactions from multiple interrogated loci before hybridization.
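The sampling argument in the abstract can be illustrated with a small calculation. The sketch below is not the authors' theoretical model; it assumes that clones are drawn independently and with replacement in proportion to transcript abundance, and it uses a hypothetical mixture (one dominant transcript and nine rare ones) and a hypothetical, idealized split into three sub-populations. The function name `expected_distinct`, the abundances and the clone counts are all illustrative assumptions.

```python
def expected_distinct(abundances, n_clones):
    """Expected number of distinct transcripts observed when n_clones are
    drawn independently, with replacement, in proportion to abundance."""
    total = float(sum(abundances))
    # A transcript with relative abundance p is missed by all n draws
    # with probability (1 - p) ** n.
    return sum(1.0 - (1.0 - a / total) ** n_clones for a in abundances)


# Hypothetical RACE mixture: one dominant transcript, nine rare ones.
pooled = [1000] + [1] * 9

# Direct cloning: all 96 clones are picked from the full mixture.
direct = expected_distinct(pooled, 96)

# Idealized segregation (an assumption): the dominant transcript ends up
# alone in one RT-PCR product and the rare ones are split over two others.
subpopulations = [[1000], [1] * 5, [1] * 4]
clones_per_pool = 96 // len(subpopulations)
segregated = sum(
    expected_distinct(pool, clones_per_pool) for pool in subpopulations
)

print(f"direct cloning:   {direct:.2f} distinct transcripts expected")
print(f"segregated pools: {segregated:.2f} distinct transcripts expected")
```

With the same total sequencing effort, the segregated design is expected to recover nearly all of the rare transcripts in this toy setting, whereas direct cloning mostly resamples the dominant one. Real RT-PCR products will not separate this cleanly, so the numbers only indicate the direction of the effect.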



Gene Expression Omnibus




The project at Institut Municipal d'Investigació Mèdica, Center for Genomic Regulation (CRG), the Universities of Lausanne and Geneva, and Affymetrix was supported by grants U01HG003150 and U01HG003147 from the US National Human Genome Research Institute, National Institutes of Health; at IMIM and CRG also funded by grant BIO2006-03380 from the Spanish Ministry of Education and Science and from the European BioSapiens Consortium; at the Universities of Lausanne and Geneva also funded by the Swiss National Science Foundation, the EU AnEUploidy project and the National Center of Competence in Research Frontiers in Genetics; and at Affymetrix also funded by the National Cancer Institute, National Institutes of Health (N01-CO-12400) and by Affymetrix, Inc. The portion of this work carried out at Center for Cancer Systems Biology was funded by a grant from the Ellison Foundation (to M.V.) and as Institute Sponsored Research from the Dana Farber Cancer Institute Strategic Initiative. We acknowledge J.M. Oller for reviewing the probabilistic results and R. Castelo, C. Howald and D. Martin for useful suggestions.

Author information

Author notes

    • Sarah Djebali
    • Philipp Kapranov
    • Sylvain Foissac
    • Julien Lagarde
    • Alexandre Reymond

    These authors contributed equally to this work.


  1. Grup de Recerca en Informàtica Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain.

    • Sarah Djebali
    • Julien Lagarde
    • France Denoeud
    • Roderic Guigó

  2. Affymetrix, Inc., 3420 Central Expressway, Santa Clara, California 95051, USA.

    • Philipp Kapranov
    • Jorg Drenkow
    • Erica Dumais
    • Thomas R Gingeras

  3. Center for Genomic Regulation, Dr. Aiguader 88, 08003 Barcelona, Spain.

    • Sylvain Foissac
    • Roderic Guigó

  4. Center for Integrative Genomics, University of Lausanne, Genopole Building, 1015 Lausanne, Switzerland.

    • Alexandre Reymond

  5. Department of Genetic Medicine and Development, University of Geneva Medical School, 1 rue Michel Servet, 1211 Geneva, Switzerland.

    • Catherine Ucla
    • Carine Wyss
    • Periklis Makrythanasis
    • Stylianos E Antonarakis

  6. Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, 44 Binney Street, Boston, Massachusetts 02115, USA.

    • Ryan R Murray
    • Chenwei Lin
    • David Szeto
    • Marc Vidal
    • Kourosh Salehi-Ashtiani

  7. Departament d'Estadística, Universitat de Barcelona, Diagonal 645, 08028 Barcelona, Spain.

    • Miquel Calvo

  8. Human and Vertebrate Analysis and Annotation Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1HH, UK.

    • Adam Frankish
    • Jennifer Harrow




T.R.G., S.E.A., A.R., P.K. and R.G. participated in the overall design of the experiments and the subsequent analysis. A.R., C.U., C.W., P.M. and S.E.A. performed the RACE reactions. J.D., E.D. and P.K. performed the hybridization of the RACE reactions onto tiling arrays. R.R.M., C.L., D.S., K.S.-A. and M.V. carried out the RT-PCRs and the cloning and sequencing of candidates. S.D., S.F., J.L., F.D. and R.G. developed software and carried out the bioinformatics analysis. M.C. developed the theoretical model for sampling and carried out the computational simulations. A.F. and J.H. provided the reference gene annotation and helped map the RT-PCR sequences to the genome.

Competing interests

P.K., J.D., E.D. and T.R.G. are Affymetrix employees.

Corresponding author

Correspondence to Roderic Guigó.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–8, Supplementary Tables 1–2, Supplementary Methods, Supplementary Results
