The wheat genome is large and complex, and has defied complete sequencing. But the most comprehensive analysis so far of the plant's genes will support efforts to optimize the supply of this vital food crop. See Letter p.705
The bread wheat genome presents a significant challenge to researchers. At 17 gigabases, it is about five times the size of the human genome, and it is hexaploid, meaning that it contains six sets of chromosomes, which derive from three different genomes. So why bother to sequence such a difficult genome? Wheat is arguably the most important plant to humans. Bread wheat (Triticum aestivum) is the world's most widely grown crop, covering more than 200 million hectares of land (ref. 1) throughout temperate, Mediterranean-type and subtropical regions of both the Northern and Southern hemispheres. Although total production of wheat, 681 million tonnes in 2011 (ref. 1), is slightly lower than that of maize (corn) and rice, wheat is the primary carbohydrate and protein source for the world's population (ref. 1). For this reason, researchers around the world are tackling the challenge of the plant's genome. On page 705 of this issue, Brenchley et al. (ref. 2) present a detailed analysis and assembly of wheat gene sequences that will provide a key resource for crop scientists.
Systematic wheat breeding began around 100 years ago, but farmers' improvement of wheat strains by selective breeding can be traced back to the beginnings of agriculture almost 10,000 years ago (ref. 3). The 'Green Revolution' of the 1960s, a series of advances in agricultural research, technology and infrastructure, triggered a drastic improvement in wheat yields. However, wheat production has struggled to meet global demand, and an increasingly variable and unstable climate is adding to the problems of wheat supply. It has been calculated that wheat production must increase by about 60% by 2050 to meet predicted demand (ref. 4). This is a daunting challenge, but one that is taken seriously by the international community, as emphasized by the recent decision of the G20 group of countries to establish the international Wheat Initiative, designed to develop resources and capabilities to target wheat improvement (ref. 5). A key objective of this initiative is to establish genomics resources so that new breeding technologies can be effectively and rapidly applied to wheat (Fig. 1).
The wheat genome is not only large, but also complex. It contains extensive stretches of repetitive non-coding DNA, which make sequence assembly difficult. Where there are genes, it is often hard to differentiate between the three constituent genomes, because each has a related set of genes. And the order of the genes has been partly shuffled on several of the chromosomes, adding to the complexity.
There are three strategies that can be adopted to generate and assemble a full sequence for such a problematic genome. One approach is to make clone libraries that contain long stretches of DNA (more than 100,000 bases in each clone) derived from each wheat chromosome arm; there are 21 chromosomes (seven from each genome), giving 42 chromosome arms. The clones can be used to construct an overlapping series of DNA segments to produce a minimum tiling path for sequence assembly. This strategy is feasible because wheat is highly tolerant of chromosome changes and wheat lines are available in which each chromosome arm is present as a separate telosomic chromosome (ref. 6), which can be readily separated from the rest of the genome. Groups from around the world are working on this project as part of the International Wheat Genome Sequencing Consortium (ref. 7).
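The idea of a minimum tiling path can be illustrated with a small sketch. It assumes, purely for illustration, that each clone's start and end position on a chromosome arm is already known (in practice these positions have to be inferred from clone overlaps), and it uses a standard greedy interval-cover rule. The function name and the clone coordinates are hypothetical and are not taken from the sequencing project.

```python
def minimum_tiling_path(clones, region_start, region_end):
    """Pick the fewest overlapping clones that span a region (greedy interval cover).

    clones: list of (name, start, end) tuples; coordinates are hypothetical,
            given here in kilobases along one chromosome arm.
    Returns the chosen clone names in order; raises ValueError if there is a gap.
    """
    ordered = sorted(clones, key=lambda c: c[1])  # sort by start position
    path = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best = None
        # Of all clones that start at or before the covered point,
        # keep the one that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"no clone covers position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path


if __name__ == "__main__":
    clones = [
        ("c1", 0, 120),
        ("c2", 50, 150),
        ("c3", 100, 260),
        ("c4", 140, 300),
        ("c5", 250, 400),
    ]
    print(minimum_tiling_path(clones, 0, 400))  # ['c1', 'c3', 'c5']
```

Here clones c2 and c4 are redundant: three clones already cover the whole 400-kilobase region, so only those three would need to be sequenced.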
The second approach is to assemble the sequences of the three diploid genomes that are the progenitors of the wheat genome, which is broken down into A, B and D genomes. The progenitor species of the A and D genomes are known to be Triticum urartu and Aegilops tauschii, respectively, and sequence information is available. The progenitor of the B genome is believed to have been a close relative of Aegilops speltoides, and sequence information can be obtained from this species or derived from the tetraploid wheat species Triticum durum, which carries the A and B genomes (ref. 8).
'Shotgun sequencing' is the third approach. Unlike the large-clone process, which uses sizeable fragments to develop an approximate sequence 'map', the shotgun approach relies on sequence overlap in large numbers of much shorter sequences to assemble a genome. A public resource of shotgun sequences for each chromosomal arm of the wheat genome is close to completion (ref. 6). However, although sequencing technologies are improving rapidly, such that shotgun sequencing of the entire hexaploid wheat genome is feasible, reliable assembly of these sequences from such a large genome is not yet possible.
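The principle of assembling a sequence from overlaps between short reads can be shown with a toy example. The sketch below is a deliberately simple greedy merger, not the software used for any wheat assembly: it repeatedly joins the two reads with the longest suffix-to-prefix overlap. The function names, the reads and the minimum overlap of three bases are all illustrative assumptions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Returns the list of contigs left when no overlap of at least
    `min_len` bases remains.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
    print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

With three clean reads the merger recovers a single contig. In a genome full of repeats, many reads overlap equally well with several partners, and a simple rule of this kind can no longer tell which join is correct, which is one reason assembly becomes unreliable at the scale of wheat.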
The strategy of using large-insert clones to produce chromosome-arm-specific data is expected to yield the best-quality sequence assembly, but this will be a relatively slow process. However, the different approaches are not mutually exclusive and can be combined in a single effort, as Brenchley and colleagues have done. The authors' extensive sequencing led to the identification of between 94,000 and 96,000 genes. They compared these genes with sequence data from the progenitor genomes, and were able to assign around two-thirds of the genes to the A, B or D genomes. The approach was tested using sequences from individual chromosome arms: the researchers used shotgun sequencing of the isolated group 1 chromosomes (1A, 1B and 1D) to develop a set of sequences to 'train' the methods, then used them to assign sequences to specific genomes. The assignment of genes to the A, B or D genome is particularly valuable to wheat researchers because it allows them to differentiate genes and DNA markers from each of the three genomes, a task that has otherwise been difficult and time-consuming.
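The logic of assigning a gene to a genome by comparison with progenitor sequences can be sketched as follows. This is a much-simplified stand-in, not the trained method described by the authors: it assumes that the gene copy and the three reference sequences are already aligned and of equal length, scores each reference by the fraction of identical positions, and leaves the gene unassigned when the best two scores are too close. The function names, the toy sequences and the 0.05 margin are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_subgenome(gene, references, min_margin=0.05):
    """Return the label of the best-matching reference, or None if ambiguous.

    references: dict mapping a genome label ('A', 'B', 'D') to a reference
                sequence aligned to `gene` (toy data in this sketch).
    min_margin: how far the best identity must exceed the runner-up
                before a call is made (an arbitrary illustrative value).
    """
    if len(references) < 2:
        raise ValueError("need at least two references to compare")
    scores = {label: identity(gene, ref) for label, ref in references.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_label, best_score), (_, second_score) = ranked[0], ranked[1]
    if best_score - second_score < min_margin:
        return None  # too close to call: leave the gene unassigned
    return best_label


if __name__ == "__main__":
    references = {
        "A": "ATGCATGCAT",
        "B": "ATGCTTGCAA",
        "D": "ATGGATGCTT",
    }
    print(assign_subgenome("ATGCATGCAT", references))  # A
    print(assign_subgenome("ATGCATGCAA", references))  # None (ties between A and B)
```

The unassigned case matters: because the three genomes carry closely related gene sets, some copies cannot be told apart with confidence, which is consistent with only around two-thirds of the genes being assigned.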
Although Brenchley et al. have provided extensive sequence information, we are still a long way from having a complete wheat-genome assembly. However, the authors' data form a framework to which results from shotgun sequencing of other wheat varieties and from the chromosome-arm sequencing project can be added, and the reliability of the assembly will rise as more groups add their findings.
Brenchley and colleagues' sequence analysis also reveals the extent of wheat-genome flexibility. The researchers find that the formation of a hexaploid genome from three diploid progenitors has led to significant losses of members of many gene families, but also an expansion of other families, including those involved in plant metabolism and growth. These changes are likely to have been a key factor in the success of wheat in so many regions and climatic zones.
So will these findings give us clues to new strategies for wheat improvement? And can we harness the dynamism of the genome to generate varieties better able to cope with a variable environment? Wheat is grown largely in environments in which yield is being undermined by biotic and abiotic stresses. In fact, although wheat yields can exceed 12 tonnes per hectare (ref. 9), the global average is below 3 tonnes per hectare, and in non-irrigated environments the average is below 2 tonnes per hectare (ref. 1). Heat and drought stress are expected to be the major challenges for future wheat production, and plant breeding offers the best approach for responding to these pressures (ref. 10). The current and future advances in understanding the wheat genome, and the genomes of other crop plants (Box 1), are likely to hold the key to developing breeding strategies that will optimize yields under variable conditions.
FAOSTAT http://faostat3.fao.org (2012).
Brenchley, R. et al. Nature 491, 705–710 (2012).
Zohary, D. & Hopf, M. Domestication of Plants in the Old World (Oxford Univ. Press, 2000).
Food and Agriculture Organization of the United Nations: Declaration of the World Summit on Food Security (Rome, 16–18 November 2009) (www.fao.org/wsfs/world-summit/en).
Dolezel, J., Kubaláková, M., Paux, E., Bartos, J. & Feuillet, C. Chromosome Res. 15, 51–66 (2007).
Feuillet, C., Langridge, P. & Waugh, R. Trends Genet. 24, 24–32 (2008).
Lobell, D. B., Cassman, K. G. & Field, C. B. Annu. Rev. Environ. Resour. 34, 179–204 (2009).
Tester, M. & Langridge, P. Science 327, 818–822 (2010).
The International Barley Genome Sequencing Consortium Nature 491, 711–716 (2012).