Main

In the current post-genomic era, high-quality genome sequences are available for key model organisms and many important pathogens. Furthermore, as genome sequencing continues to become cheaper, draft genomes for thousands of other species are becoming available. This wealth of data has provided insight into many biological questions, but much about these genomes remains poorly understood: even in the best-studied organisms, a large proportion of genes still lack functional characterization. Determining the function of these genes is perhaps the most important step in making the most of this data deluge.

The field of functional genomics aims to tackle this problem on the genomic scale, and two recent articles in Genome Research show how genome sequencing can be used as an assay in functional genomics experiments. Thus, advances in high-throughput genome sequencing are not simply flooding us with many unanalysed genome sequences: they can also help us understand which genes are involved in different phenotypic traits. The approach used in the articles is what geneticists call a 'forward' approach: identifying the genes that are involved in a particular biological process, rather than focusing on an individual gene to determine its role in the cell. This approach lends itself particularly well to high-throughput methodologies.

In the first article, the authors aimed to identify genes that are essential to the survival of Trypanosoma brucei, the causative agent of sleeping sickness, using an RNA interference approach followed by high-throughput sequencing (ref. 1). They transfected trypanosomes with plasmids containing random fragments of the genome that were regulated by an inducible promoter. Subsequent induction of expression from these plasmids resulted in degradation of the corresponding native transcript through RNA interference. The pool of transfected trypanosomes was then grown for 20 generations. Parasites in which a gene required for growth was no longer expressed at sufficiently high levels did not survive, so plasmids with inserts relating to essential genes were lost from the pool. The plasmids present in the parasites that grew were then subjected to high-throughput sequencing, and the sequencing reads were mapped back to the reference genome. Those genes that were not represented in the sequencing effort were considered essential for growth. The authors found that 25–37% of T. brucei genes were essential for optimal growth in any particular condition they examined, while a core 10% were essential in all conditions.
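The logic of this screen can be illustrated with a minimal sketch. It assumes that reads have already been mapped and counted per gene, and that counts from an uninduced control pool are available for comparison; the function name, the thresholds and the use of a control pool are illustrative assumptions and are not taken from the article.

```python
from typing import Dict, List


def flag_depleted_genes(
    reads_before: Dict[str, int],
    reads_after: Dict[str, int],
    min_before: int = 20,    # assumed cutoff: ignore genes with too few control reads
    max_ratio: float = 0.1,  # assumed cutoff: read share must fall to <= 10% of control
) -> List[str]:
    """Return genes whose share of mapped reads collapsed after induced growth.

    Hypothetical helper: a gene is flagged when its fraction of all mapped
    reads after selection is at most `max_ratio` times its fraction before.
    """
    total_before = sum(reads_before.values())
    total_after = sum(reads_after.values())
    flagged = []
    for gene, before in reads_before.items():
        if before < min_before:
            continue  # too poorly covered in the control pool to judge
        after = reads_after.get(gene, 0)
        share_before = before / total_before
        share_after = after / total_after if total_after else 0.0
        if share_after <= max_ratio * share_before:
            flagged.append(gene)
    return sorted(flagged)


# Toy counts, invented for illustration only.
control = {"geneA": 100, "geneB": 100, "geneC": 5}
induced = {"geneA": 190, "geneB": 2, "geneC": 8}
print(flag_depleted_genes(control, induced))
```

Comparing each gene's share of reads, rather than its raw count, keeps the result from depending on how deeply each pool was sequenced. A real analysis would also need replicates and a statistical test.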

In the second article, Parts et al. have updated a classic genetic technique using high-throughput sequencing (ref. 2). Geneticists have long examined patterns of inheritance to understand the genetic basis of phenotypic traits. The genomic regions that are responsible for a particular trait can be identified on the basis of inheritance of genetic markers that are spaced throughout the genome; offspring that possess a parental trait will tend to carry the associated marker, whereas offspring that do not have the trait will tend to lack it. The use of genome sequencing rather than genetic markers allows the identification of specific genes that are responsible for the trait, rather than just a region of the genome that may contain multiple genes. Parts et al. generated a vast pool of offspring from a sexual cross between a heat-tolerant strain and a heat-sensitive strain of budding yeast. The pool of 10 million–100 million progeny, each with a novel combination of parental alleles, was then grown at 40 °C to select the offspring for heat tolerance. DNA was extracted from samples of the stressed pool of yeast at successive time points and subjected to high-throughput sequencing. Because the number of reads carrying each parental allele reflects how common that allele is in the pool, the quantitative nature of high-throughput sequencing allowed changing allele frequencies in the population to be determined. Significant changes in allele frequency were observed at 21 loci, several times more than the number of loci that were previously identified using traditional linkage methods. These 21 loci included multiple members of the Ras–cyclic AMP signalling pathway, as well as genes that had not previously been implicated in the heat stress response.
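The idea of reading allele frequencies from sequencing data can be sketched as follows. The sketch assumes that, for each variable site, the reads supporting each parental allele have been counted at every time point; the function names and the fixed shift threshold are illustrative assumptions, and this crude cutoff stands in for, and is not, the statistical analysis used by Parts et al.

```python
from typing import Dict, List, Tuple


def allele_frequency(tolerant_reads: int, sensitive_reads: int) -> float:
    """Fraction of reads at a site that carry the heat-tolerant parent's allele."""
    total = tolerant_reads + sensitive_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return tolerant_reads / total


def frequency_shifts(
    counts: Dict[str, List[Tuple[int, int]]],
    min_shift: float = 0.2,  # assumed cutoff, not a significance test
) -> Dict[str, float]:
    """Return sites whose allele frequency changed by at least `min_shift`.

    `counts` maps a site name to (tolerant_reads, sensitive_reads) pairs,
    one pair per time point, in chronological order.
    """
    shifts = {}
    for site, series in counts.items():
        change = allele_frequency(*series[-1]) - allele_frequency(*series[0])
        if abs(change) >= min_shift:
            shifts[site] = change
    return shifts


# Toy read counts, invented for illustration only.
pool = {
    "site1": [(50, 50), (70, 30), (90, 10)],  # tolerant allele sweeps upwards
    "site2": [(48, 52), (50, 50), (51, 49)],  # essentially unchanged
}
print(frequency_shifts(pool))
```

Sites linked to a gene that contributes to heat tolerance would be expected to behave like `site1`, whereas unlinked sites should stay near their starting frequency, as `site2` does.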

These protocols can be adapted to dissect the genetic basis of many traits, including different stress responses and resistance to various drugs, in these and other organisms. For instance, Parts et al. suggest that their method could be used to identify genes that are associated with lifespan by selecting for longer-lived yeast. As sequencing costs fall to within the reach of more and more researchers, we can look forward to the application of high-throughput genome sequencing as a functional genomics assay for many different traits in a wide variety of microorganisms.