A spruce sequence


The first published whole-genome draft sequence of a gymnosperm, the Norway spruce, provides a powerful platform for studying the unique development, adaptation and evolution of this major group of plants. See Article p.579

Within the gymnosperm subgroup of seed plants are the iconic conifers, which dominate many forest ecosystems of the cold-temperate and subtropical regions of the Northern Hemisphere. Among the conifers are Earth's oldest living individual plants, the bristlecone pines; its largest trees, the giant sequoias; and its tallest, the coast redwoods. Conifers also include the pine and spruce genera that supply much of the world's wood for pulp, paper and solid-wood products. Genetic analysis is the key to understanding the biology of these trees, but gymnosperms typically have very large genomes, of up to 37 gigabases (ref. 1), and an abundance of repetitive DNA, making their sequences difficult to assemble. So the sequence of the Norway spruce genome, reported by Nystedt et al.2 on page 579 of this issue, represents a major technical, as well as an information-rich, achievement*.

Gymnosperms, of which there are around 1,026 species3, are vascular plants, meaning that they contain a tubular tissue network that transports water and nutrients, and provides mechanical support. The gymnosperms are one of two subgroups of seed-forming vascular plants, the other being the angiosperms, the flowering plants. There are some 350,000 species of angiosperm, some woody and others herbaceous, and this subgroup includes all our major food crops.

The seeds of angiosperms are enclosed in an ovary, whereas those of gymnosperms are not enclosed. The two subgroups also differ in their mechanisms of growth, development, metabolism, adaptation and evolution4, in features that include wood microanatomy, water transport, mechanical support, reproduction and the ability to adapt to environmental change. The vascular system of conifers depends on long, thin cells called tracheids, which have lateral pits; this primitive system is also found in early progymnosperm fossils. Angiosperms typically have more complex wood-cell anatomy, in which vessels with much larger diameters facilitate water transport and more-specialized fibre cells give mechanical support. The phenolic polymer lignin contributes to mechanical support and provides a hydrophobic surface for water transport in both angiosperms and gymnosperms, but the composition of the polymer differs markedly between the two groups of plants.

Identifying the genes that underlie these differences is of interest for both basic and applied research, but the long reproductive cycles and large sizes of gymnosperms have made traditional, breeding-based analyses of these plants challenging. DNA-based technology that can bypass these limitations has been particularly useful in forest trees, enabling genomic mapping, gene sequencing, genomic selection and genetic engineering. Whole-genome sequences are particularly powerful, because they provide a platform for a spectrum of new technologies, such as studies of an organism's full transcriptome, its complement of transcribed RNA molecules. Nystedt and colleagues' draft sequence is the first gymnosperm sequence to be published (Fig. 1), and a loblolly pine genome sequence (around 22 Gb) is expected to follow soon5. These conifer genomes are the largest plant genomes sequenced so far, larger even than the 17-Gb genome of wheat6 (an angiosperm).

Figure 1: Gymnosperm genes.


Trees of the gymnosperm subgroup make up much of the forests of the Northern Hemisphere and provide a large fraction of the world's wood. Nystedt et al.2 have presented the first draft whole-genome sequence of a gymnosperm, the Norway spruce (Picea abies).

The spruce genome will not only accelerate the investigation of gymnosperm biology, it will also provide broader genetic and evolutionary insight. For example, researchers of the ENCODE project7 recently argued that 80% of the 3.2-Gb human genome is functional in some way. But large plant and animal genomes pose a challenge to this proposal. Consider, for instance, the spruce genome (around 20 Gb) and that of thale cress, the model plant Arabidopsis thaliana (0.135 Gb). If a similarly high proportion of these genomes were functional, what properties of gymnosperms could necessitate such a massively larger number of functional genetic elements? Repression of only two genes enables A. thaliana to extend its growth cycle and produce a substantial amount of wood8, suggesting that the large genomes of gymnosperms are not attributable to a perennial growth habit or wood formation. Early comparisons of transcribed genes indicated9 that the vast majority of genes in gymnosperms and angiosperms have homologues in the same gene families. Nystedt and colleagues estimate that the spruce genome contains 28,354 genes, which is very close to A. thaliana's 27,407. The angiosperms maize (corn), rice and poplar have estimated gene numbers of around 40,000. But these coding regions make up only a fraction of the sequence of these large genomes, and the functions of the remaining sequences are still obscure.
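The scale of this mismatch can be checked with a little arithmetic. The following minimal Python sketch uses only the genome sizes and gene counts quoted above; the spruce size is the approximate 'around 20 Gb' figure, and the dictionary layout and the helper name genes_per_mb are illustrative choices, not part of Nystedt and colleagues' analysis.

```python
# Genome sizes (in gigabases) and estimated gene counts as quoted in the text.
# The spruce size is approximate ("around 20 Gb"), so results are rough.
GENOMES = {
    "Norway spruce": {"size_gb": 20.0, "genes": 28_354},
    "Arabidopsis thaliana": {"size_gb": 0.135, "genes": 27_407},
}


def genes_per_mb(size_gb: float, genes: int) -> float:
    """Return the average number of genes per megabase of genome."""
    return genes / (size_gb * 1000.0)  # 1 Gb = 1,000 Mb


spruce = GENOMES["Norway spruce"]
cress = GENOMES["Arabidopsis thaliana"]

# How many times larger the spruce genome is, and how many times more genes it has.
size_ratio = spruce["size_gb"] / cress["size_gb"]
gene_ratio = spruce["genes"] / cress["genes"]

spruce_density = genes_per_mb(spruce["size_gb"], spruce["genes"])
cress_density = genes_per_mb(cress["size_gb"], cress["genes"])

print(f"Genome size ratio (spruce / A. thaliana): {size_ratio:.0f}x")
print(f"Gene count ratio (spruce / A. thaliana): {gene_ratio:.3f}x")
print(f"Genes per Mb, spruce: {spruce_density:.2f}")
print(f"Genes per Mb, A. thaliana: {cress_density:.1f}")
```

On these rounded figures, the spruce genome is roughly 150 times the size of the A. thaliana genome yet holds only about 3.5% more genes, so the average gene density of the two genomes differs by about two orders of magnitude.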

Another curious aspect of gymnosperm genomes is the evolutionary conservation, in many species, of a haploid (single copy) chromosome number of 12, despite their genome sizes ranging from 9.7 to 37 Gb (ref. 1). The genomes will also provide insight into mechanisms of ancient and recent evolutionary adaptation in plants. Gymnosperms are thought to have originated from progymnosperms 360 million years ago, but much about the origin of angiosperms remains a mystery. Although the fossil record supports a gymnosperm origin for angiosperms, estimates of the time of angiosperm origins vary by more than 100 million years. There is also controversy regarding relationships within the gymnosperms, particularly the relationship of the conifers to the gnetophytes, which include the bizarre desert plant Welwitschia.

The genome sequences of these trees will not only help us to understand the past, but may also increase our understanding of present-day northern-latitude forests. Gymnosperms became the dominant forest plants in the late Palaeozoic and the Mesozoic eras, around 300 million to 70 million years ago. But during the most recent ice age, much of the land at northern latitudes was covered by ice. When the glaciers last retreated, only about 10,000 years ago, the conifers were the major pioneer species that dominated that land. Understanding how the gymnosperms established new ecosystems as the glaciers shrank is becoming more important as we anticipate the effects of global climate change on the world's forests.


*This article and the paper under discussion2 were published online on 22 May 2013.


1. Ahuja, M. R. & Neale, D. B. Silvae Genetica 54, 126–137 (2005).
2. Nystedt, B. et al. Nature 497, 579–584 (2013).
3. Christenhusz, M. J. M. et al. Phytotaxa 19, 55–70 (2011).
4. Beck, C. B. Origin and Evolution of Gymnosperms (Columbia Univ. Press, 1988).
5. Pine Reference Sequences
6. Brenchley, R. et al. Nature 491, 705–710 (2012).
7. Dunham, I. et al. Nature 489, 57–74 (2012).
8. Melzer, S. et al. Nature Genet. 40, 1489–1492 (2008).
9. Kirst, M. et al. Proc. Natl Acad. Sci. USA 100, 7383–7388 (2003).


Author information



Corresponding author

Correspondence to Ronald Sederoff.


Cite this article

Sederoff, R. A spruce sequence. Nature 497, 569–570 (2013).
