Introduction

The natural environments of organisms present a multitude of biotic and abiotic challenges that require both short-term ecological and long-term evolutionary responses. Such responses long have been the subject of biological interest, yet their inherent complexity has made genetic and mechanistic dissection empirically difficult. Recent technical advances in high-throughput sequencing, genotyping and genome-wide expression profiling, coupled with bioinformatics approaches for handling such data, hold great promise for dissecting these responses with unprecedented resolution. The application of new techniques and resources will not be enough; a deeper understanding of these responses will necessarily require a multidisciplinary approach, combining organismal analyses with molecular genetics and genomics, laboratory experiments with field studies and all within an ecologically relevant framework. Such an integration of fields faces many challenges but nonetheless is underway and will revolutionize our understanding of a broad range of biological phenomena.

Ecological and laboratory-based genetic/genomic investigations traditionally have occupied different areas of the biological sciences (Figure 1). With a few notable exceptions, research programs are generally positioned in one domain or the other, but do not regularly cross the boundary that separates these disciplines by utilizing the tools and approaches of both. Ecological genomic studies seek to integrate these disciplines through the use of genomic approaches in an ecological context. For example, how can genomics be used to link population and community responses through organisms at the level of genes and gene expression? Here, we examine this emerging interdisciplinary field that combines ecological and genomic approaches (that is, ecological genomics). By ‘genomic approaches’, we refer to any genome-enabled approach, whether aimed at discovering the ecological functions of single or multiple genes. We define ecological genomics as an integrative field of study that seeks to understand the genetic mechanisms underlying responses of organisms to their natural environment. These responses include modifications of biochemical, physiological, morphological or behavioral traits of adaptive significance. Our focus here is not, however, to detail many new and powerful genetic and genomic techniques currently available to evolutionary and ecological functional genomics; that has been carried out elsewhere (Gibson and Muse, 2004; Thomas and Klaper, 2004; Vasemagi and Primmer, 2005). Instead, our aim is to focus on why such a combined approach is valuable and to highlight the insights that can be gained.

Figure 1

Conceptual framework for Ecological Genomics. The top half of the figure depicts the interactions among levels of biological organization that are traditionally the subject of ecology. The black arrows indicate ecological interactions between the organism, the population and community levels and the ecosystem, with the idea being that properties of organisms affect the make-up and functions of the other levels and vice versa. The bottom half of the figure depicts interactions among levels of biological organization that are traditionally the subject of laboratory-based, genetic, cellular or physiological studies. Here, the black arrows also indicate the interactions between the levels: the responses of an organism affect and are affected by its genotype, which in turn affects which genes are expressed and at what levels, which in turn affects the phenotype of the organism, ultimately leading to its overall response. Ecological genomic studies seek to integrate these disciplines (orange arrows) through the use of functional genomics approaches.

We expect that one's perception of ecological genomics will depend upon one's scientific background and experiences. If one considers this discipline from a more genetic or genomic perspective, one may wonder how an ecological context could be useful. Conversely, if one considers ecological genomics from purely an ecological perspective, one may ask what additional insights could be gained by understanding the genetic mechanisms that underlie ecological interactions. So we begin with views of ecological genomics from these differing viewpoints.

Why an ecological context?

The diversity of organismal forms, physiologies and evolved responses in nature results from millions of generations of evolution. While much has been learned from bringing organisms into the laboratory to study elements of their biology in isolation, ignoring the ecological context in which these elements arose and persist runs the risk of a suboptimal understanding of particular biological responses and processes.

Ironically, but not unexpectedly, an ecological context has been most lacking for model organisms, the species on which a majority of our current knowledge of genetic mechanisms is based. This deficiency of course makes sense. These species were chosen as models in part because of the ease with which they could be reared away from their natural environments. The lack of a natural ecological context was considered a small price in return for the wealth of genetic information that could be obtained through laboratory-based analyses. While such an approach clearly has merit, a tacit assumption is that loci and genetic pathways identified in the laboratory are likely to be the same as those acting in natural environments. Recent work casts doubt on whether such assumptions are fully warranted. For example, using a set of Arabidopsis thaliana recombinant inbred lines, Weinig et al. (2002) mapped quantitative trait loci (QTL) for flowering time, one of the genetically best-understood traits in plants, under geographically and climatically diverse field conditions, as well as in highly controlled growth chamber conditions. While some QTLs were detected in all environments, others were detected only in subsets of environments, such as in field and growth chamber environments that shared a similar photoperiod. Still others were detected only under natural field conditions. Such findings illustrate the genetic complexity and environmental dependency of this important plant life history trait.

Insights gained through conducting genetic-based experiments in natural environments are not limited to the Arabidopsis examples described above. Several additional studies (Lexer et al., 2003a, 2003b; Carroll et al., 2004; Kessler et al., 2004, 2006; Baldwin et al., 2006) point to the merits of conducting experiments under more natural ecological settings. Another notable example involves the recent explanation of lower than expected frequency for the transmission ratio distorter complex (t complex) in populations of wild mice, Mus musculus. The Mus t complex consists of a 20 cM region on chromosome 17 that encompasses hundreds of genes, including recessive lethal mutations. Recombination throughout this region is suppressed due to four large, non-overlapping inversions, and thus the complex is inherited as a distinct haplotype. The presence of meiotic drive genes linked to the t complex results in transmission distortion at frequencies greater than 90%. Despite these observations, the complex occurs at far lower frequency in natural populations than would be predicted. While an obvious explanation for this difference is a fitness disadvantage for individuals possessing this complex, laboratory-based studies have been unable to document a clear fitness cost. Using semi-natural enclosures, Carroll et al. (2004) examined ecological competition in multiple populations of wild house mice that were polymorphic for the t complex. These studies were conducted over a period of 10 months, which is approximately equal to one generation for this species. In contrast to the equivocal findings of multiple laboratory studies, experiments conducted under semi-natural conditions revealed significant fitness declines in both male and female individuals carrying the t complex. These fitness declines only became evident when normal social and competitive interactions were allowed to occur naturally.
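The gap between predicted and observed t complex frequencies can be made concrete with a small calculation. The sketch below is illustrative only and is not the analysis of Carroll et al. (2004). It assumes a deliberately simple deterministic model: random mating, an infinite population, heterozygous males that transmit the t haplotype with probability tau, Mendelian transmission in females, complete lethality of t/t homozygotes and no other fitness cost. The function names and the value tau = 0.95 (chosen to match 'greater than 90%') are hypothetical choices made for this example.

```python
import math


def next_het_freq(h, tau):
    """Advance one generation; h is the frequency of +/t adults (assumed model)."""
    egg_t = h / 2.0            # females transmit t in Mendelian fashion
    sperm_t = tau * h          # heterozygous males transmit t with probability tau
    lethal = egg_t * sperm_t   # t/t zygotes, assumed to die before adulthood
    het = egg_t * (1.0 - sperm_t) + (1.0 - egg_t) * sperm_t
    return het / (1.0 - lethal)  # renormalize among surviving genotypes


def simulate_t_frequency(tau, h0=0.01, generations=500):
    """Iterate the recursion and return the adult t allele frequency."""
    h = h0
    for _ in range(generations):
        h = next_het_freq(h, tau)
    return h / 2.0  # every surviving carrier is a heterozygote


def closed_form_t_frequency(tau):
    """Equilibrium of the same recursion, valid for tau > 0.5."""
    return 0.5 - 0.5 * math.sqrt((1.0 - tau) / tau)


tau = 0.95  # illustrative value, not a measured estimate
simulated = simulate_t_frequency(tau)
expected = closed_form_t_frequency(tau)
print(f"simulated: {simulated:.4f}  closed form: {expected:.4f}")
```

Under these assumptions, the model predicts that well over a third of the alleles in the adult population should be t haplotypes. Any additional fitness cost to carriers, of the kind revealed in the semi-natural enclosures, would lower that expectation.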

Experiments conducted under more biologically realistic conditions also have provided insights into higher-level ecological interactions. For example, Kessler et al. (2004) created different transformation lines of Nicotiana attenuata (wild tobacco) to silence three genes involved in oxylipin signaling, a pathway involved in plant defense responses to herbivory. Responses of the different disruption lines to attack by a specialist herbivore (the caterpillar Manduca sexta) were evaluated under controlled conditions in the laboratory. These lines then were experimentally planted in the natural environment of N. attenuata to assess responses to natural herbivore communities. While the transformation lines exhibited qualitatively similar results in the laboratory and field with respect to herbivore damage, the field experiments revealed that different herbivore guilds differentially attacked the individual transformation lines, indicating that functional copies of particular plant genes can influence host selection for broadly different categories of herbivores.

The examples presented above represent cases where additional complexity was revealed with respect to previously characterized phenomena. An ecological context may prove equally important for research where virtually no prior knowledge is available, for instance, in determining the roles of genes of unknown function. Recent years have witnessed a flood of new sequence data as entire genome sequences are being determined for more and more organisms. While functions of many genes may be inferred from sequence homology to genes in other organisms, a majority of predicted genes still have no known function. An ecological context may aid in identifying the roles of such genes, as their current functions may be linked to the ecological and evolutionary history of the organisms in which they reside. For example, the genetic model soil nematode Caenorhabditis elegans is fed Escherichia coli in the laboratory, but encounters, and presumably feeds upon, numerous other soil bacteria in its natural environment. Microarray analyses identified many C. elegans genes that were upregulated in response to growth on the soil bacterium Micrococcus luteus. One of these, pgp-10, encodes a member of the P glycoprotein ATP-binding cassette transporter family, which is involved in multidrug resistance (Sheps et al., 2004). However, as pgp-10 mutants did not display an obvious phenotype, the function of pgp-10 was unknown. When challenged with growth on M. luteus, pgp-10 mutants grew less well than did wild-type C. elegans, indicating that pgp-10 function is required for growth on this soil bacterium (J Coolon and MA Herman, unpublished). Thus, combining ecological with genomic approaches may allow for a more complete analysis of genome function and evolution.

How commonly are such additional insights likely to be revealed? Do the examples described above represent the exceptions, or are such additional levels of complexity likely to be pervasive, and observed whenever efforts are made to consider more fully the interactions that occur at multiple levels in natural systems (Figure 1)? While the jury is still out, we are inclined to think that the latter is true, and that experimental consideration of ecological context is likely to yield considerable additional insights into studies of most biological phenomena.

Why a genetic context in ecology?

Several ecologists have recently argued for a more prominent role of genetic approaches in addressing ecological questions (Wimp et al., 2005; Crutsinger et al., 2006; Johnson et al., 2006; Whitham et al., 2003, 2006). How might a genetic context provide a deeper understanding of pattern and process in more traditional ecological investigations? Molecular and genomic tools recently have provided new insights into several well-studied biological phenomena that historically have occupied the realm of ecology. In some cases, researchers using these techniques have discovered novel organisms and unsuspected biological functions in ecosystems.

By using genomic and molecular approaches, researchers have shed light on the decades-long controversy about the role of allelochemicals (toxins exuded from roots) in controlling competitive interactions (Baldwin, 2003) and, more recently, invasiveness in plant communities (Bais et al., 2003). Since its accidental introduction from Europe in the late 1800s, spotted knapweed (Centaurea maculosa) has out-competed native plants in numerous rangelands of North America. Bais et al. (2003) used a novel integration of ecological, physiological, biochemical and genomic approaches to investigate the hypothesis that the putative allelochemical (−)-catechin exuded from spotted knapweed roots results in a toxic response in native rangeland (Centaurea diffusa) and model (A. thaliana) plants. To explore possible biochemical mechanisms of (−)-catechin function, the authors examined changes in global patterns of gene expression in a susceptible model plant A. thaliana following exposure to this compound. They suggested that the superior competitive ability of spotted knapweed may result from a release of an allelopathic flavonoid (−)-catechin from roots. This triggers a wave of reactive oxygen species production at the root meristem in nearby susceptible plants, leading to a calcium signaling cascade, which triggers genome-wide changes in gene expression, and ultimately root system death in the affected species. By using genomic and molecular approaches, these researchers have challenged ecologists’ conventional view by suggesting that toxins, rather than superior use of resources, may be the mechanism for invasiveness. However, it is not yet clear whether the findings in Arabidopsis can be generalized to responses in native rangeland species.

Another example is the use of genomic tools to further our understanding of mycorrhizal symbiosis (Graham and Miller, 2005), a widespread mutualism between fungi and roots occurring in more than 80% of plant families (Smith and Read, 1997). In spite of their ubiquity and profound ecological importance, gaps remain in our understanding of the genetic, cellular and molecular controls of the establishment of the symbiosis. Liu et al. (2003) used cDNA microarrays to examine a time series of gene expression in mycorrhizal and non-mycorrhizal Medicago truncatula roots inoculated with the arbuscular mycorrhizal (AM) fungus Glomus versiforme under low and high phosphorus (P) conditions. Among the genes exhibiting changes in expression, one group, associated with defense and stress responses, was upregulated during the initial contact with the fungus and then downregulated as the symbiosis developed. A second group was upregulated in a more sustained fashion and appeared to be correlated temporally with root colonization. These genes appeared to be involved in signaling pathways. Thus, the plant initially reacts in a defensive manner, but following molecular communications with the fungus, the plant reduces its defenses, allowing for fungal proliferation within the root. Most genes with increased transcript levels in mycorrhizal roots showed no changes in response to high P, suggesting that alterations in transcript levels were attributable to the AM fungus rather than an indirect effect of improved P nutrition resulting from the symbiosis. Future studies promise to shed light on the poorly understood genetic regulation and molecular communication between host plant and microbial symbiont in the mycorrhizal symbioses, one of the most ancient and, arguably, one of the most ecologically important mutualisms.

Finally, a more rigorous genetic approach may help to resolve a current debate among ecologists (Whitham et al., 2003, 2006): how far can genes and genotypes ‘trickle up’ to affect processes at community and ecosystem levels (Figure 1)? Results suggest that genetic differentiation among populations of trees such as Populus (Schweitzer et al., 2004), oak (Madritch and Hunter, 2002) and Metrosideros polymorpha (Treseder and Vitousek, 2001) can influence traits related to nutrient cycling in ecosystems. In these studies, plant genetic variation had strong and immediate effects on the ecosystem through the tight coupling of litter chemistry to decomposition and nitrogen cycling. Similarly, different Populus hybrids can affect community species assemblages by harboring distinct tree-dwelling communities of arthropods (Wimp et al., 2005). Furthermore, manipulations of plant intraspecific genotypic diversity in the evening primrose (Oenothera biennis) and an old-field goldenrod (Solidago altissima) demonstrated that effects of increased numbers of intraspecific genotypes in experimental field plots cascaded to the community and ecosystem levels: experimental plots with greater numbers of plant genotypes exhibited greater abundance and diversity of plant-dwelling arthropod communities in primrose (Johnson et al., 2006) and goldenrod (Crutsinger et al., 2006) and higher aboveground net primary productivity at the ecosystem level (Crutsinger et al., 2006). These kinds of results will surely give pause to many ecologists interested in species diversity and ecosystem function studies. In summary, although it is not yet clear how common such results will be, the topic of community and ecosystem genetics (Whitham et al., 2006) definitely warrants further attention.

Approaches in ecological genomics

One goal of ecological genomic studies is to understand the genetic mechanisms underlying responses of organisms to their natural environments. This question typically is focused at the level of the organism. Another goal of ecological genomic research is to understand how genomes interact at higher levels of organization. For example, is there a ‘community genome’ and, if so, can we understand how it functions? Let us consider these in turn.

When considering the interaction of organisms with their environment, we would like to identify the genes and gene functions that matter most in a given ecological interaction. One approach is to investigate the role(s) of candidate genes whose sequence identity suggests they might be important for an ecologically relevant process or phenotype. For example, Nachman and co-workers investigated the ecologically important trait of coat color in natural populations of rock pocket mice in Arizona living on dark-colored basalt lava and on light-colored rocks. As the genetic control of mammalian coat color has been extensively studied in the laboratory, the authors focused on several candidate loci. Of these, they demonstrated that the adaptive melanism was related to mutations at the melanocortin 1 receptor gene (Nachman et al., 2003). Interestingly, this adaptive melanism appears to have evolved independently in several different populations and, in spite of similar phenotypes, these changes appear to have different genetic bases (Hoekstra and Nachman, 2003). This and other examples (Johanson et al., 2000; Stinchcombe et al., 2004) demonstrate the power of a candidate gene approach.

When a candidate gene approach is not feasible, alternative methods must be employed. These alternative methods typically represent a ‘first pass’ at identifying potentially important loci and must be followed up by additional experiments. Transcriptional profiling using microarrays can identify genes whose expression changes in response to environmental perturbations and that thus become candidates for involvement in the response. This is one of the primary methods currently being used in ecological genomics research to identify important genes. Proteomic methods, such as two-dimensional gel electrophoresis to separate proteins from environmental samples followed by mass spectrometry to identify them, are now being used to determine directly which proteins are important for specific ecological interactions. Both approaches, however, require functional tests (for example, using mutants) to determine whether or not the identified genes (and proteins) are of functional consequence. A QTL mapping approach that takes advantage of controlled crosses and naturally occurring genetic variation is also a viable strategy. QTL mapping can provide an effective method for localizing the general positions of ecologically and evolutionarily relevant genes through an analysis of their linkage to polymorphic molecular markers in segregating mapping populations. Although the approach is popular, the confidence limits on QTL positions usually encompass large chromosome regions and hundreds of genes. Further refining the positions of QTLs requires finer-scale mapping and is greatly facilitated if recombination maps and physical maps have been integrated.
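The logic of QTL mapping described above can be illustrated with a minimal single-marker scan. The sketch below is an illustration under stated assumptions and is not a method from any of the studies cited here. It simulates a hypothetical set of recombinant inbred lines on one chromosome, places a single causal locus at an arbitrary marker, and scores each marker with a LOD statistic computed as (n/2) log10(RSS_null / RSS_marker). All names and parameter values (300 lines, 40 markers, a recombination fraction of 0.05 between adjacent markers, an effect of 2 residual standard deviations) are invented for the example.

```python
import math
import random


def simulate_rils(n_lines, n_markers, recomb, causal, effect, rng):
    """Simulate homozygous lines (alleles coded 0/1) and a noisy phenotype."""
    genotypes, phenotypes = [], []
    for _ in range(n_lines):
        allele = rng.randint(0, 1)
        line = []
        for _ in range(n_markers):
            # switch parental allele with probability `recomb` between markers
            if line and rng.random() < recomb:
                allele = 1 - allele
            line.append(allele)
        genotypes.append(line)
        phenotypes.append(effect * line[causal] + rng.gauss(0.0, 1.0))
    return genotypes, phenotypes


def rss(values):
    """Residual sum of squares around the mean of `values`."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def lod_scan(genotypes, phenotypes):
    """Single-marker LOD score: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    rss_null = rss(phenotypes)
    lods = []
    for m in range(len(genotypes[0])):
        groups = ([], [])
        for line, y in zip(genotypes, phenotypes):
            groups[line[m]].append(y)
        rss_marker = rss(groups[0]) + rss(groups[1])
        lods.append((n / 2.0) * math.log10(rss_null / rss_marker))
    return lods


rng = random.Random(1)
CAUSAL = 20  # hypothetical position of the causal locus
genotypes, phenotypes = simulate_rils(300, 40, 0.05, CAUSAL, 2.0, rng)
lods = lod_scan(genotypes, phenotypes)
peak = max(range(len(lods)), key=lods.__getitem__)
print(f"peak marker: {peak}, LOD: {lods[peak]:.1f}")
```

Because neighboring markers are linked to the causal locus, they also show elevated LOD scores. This is why the support interval around a QTL peak is typically broad, and why finer-scale mapping is needed to narrow it.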

Each of the approaches described above benefits extensively from genomic tools currently available only in some organisms. The favorite organisms of many ecological studies may not have these resources available. So, what, if any, compromises should be made? Should the ecology of selected organisms that may not be very representative be studied or should the genomic capabilities of more ecologically interesting taxa be developed? At this stage, both approaches have yielded interesting results (Roberts and Feder, 2000; Weinig et al., 2002; Kessler et al., 2004). Some have taken to using a combined approach, as is being done in the study of the genetic structure of Boechera stricta populations (Song et al., 2006). By taking the best of both worlds, the latter, compromise approach promises to be extremely fruitful. Other compromise approaches involve the use of cross-hybridization of RNAs from one organism to gene chips developed for other, related organisms (Renn et al., 2004). Finally, several ecologically interesting species are now being developed as genetic model systems (Gewin, 2005). Specifically, the genome sequences of the water flea (Daphnia pulex) (Colbourne et al., 2005), the three-spined stickleback (Gasterosteus aculeatus) (Peichel et al., 2001; Colosimo et al., 2005; Peichel, 2005) and the black cottonwood tree (Populus trichocarpa) (Busov et al., 2005; Difazio, 2005) are being or have been determined and relevant genetic tools developed. Although this ‘model vs non-model’ question will continue to be debated in the ecological genomics community, in the end, we expect that no single approach will be the answer for ecological genomics. Instead, the combined studies of model and non-model systems, whether together in the same research program or in separate programs, will continue to yield significant results.

Understanding how genomes interact at higher levels of organization remains a more difficult and challenging task. Metagenomic analyses of microbial communities represent the best and most convincing successes in this area so far (Handelsman, 2004). Metagenomic analysis involves the isolation of DNA from environmental samples, cloning it into large or small insert libraries, sequencing the clones and assembling the representative genomes. Large inserts help to provide a phylogenetic identity for the sequence by including taxon-specific markers such as 16S rRNA genes (reviewed by Allen and Banfield, 2005). This approach has enabled stunning discoveries of new organisms and novel metabolic pathways in the microbial world (DeLong, 2004). Beja et al. (2000, 2001) used such an approach to identify the presence of an unknown metabolic pathway and associated genes in marine bacteria. Photoorganotrophy is a novel pathway that uses proteorhodopsin (a membrane protein pigment that functions as a light-driven proton pump) to enable these bacteria to gain energy from the sun when carbon from organic matter is limiting. This unsuspected biological function is of great interest to oceanographers because it fundamentally alters our understanding of how carbon is processed within the surface waters of oceans (Karl, 2002). In yet another example, Venter et al. (2004) conducted a pilot study of the microbial metagenome of the Sargasso Sea. The results were stunning: from approximately 1500 l of seawater, they discovered more than a million genes, of which 70 000 were novel and function in a wide range of biogeochemical pathways.

In addition to identification of novel organisms and pathways, metagenomics can also reveal the extent to which species within microbial communities interact as consortia, providing complementary functions. A study of biofilms from an acid mine drainage provides an excellent example (Tyson et al., 2004). Near-complete genomes of the dominant bacteria in this environment were determined from 76 Mb of environmental sequence. Only one of these, Leptospirillum group III, a relatively minor component of the community, contained genes for nitrogen fixation. However, its ability to fix nitrogen in an environment without external nitrogen input made it the keystone species. Additional functional analyses involving community microarrays and proteomics (Ram et al., 2005) are now beginning and are necessary to determine the gene functions used by the community. In addition, the use of functional gene microarrays that assay the presence of genes involved in carbon and nitrogen cycling, for example, will also help to identify important community functions (Schadt et al., 2005). This, as well as the analysis of more complex communities, such as those found in the soil, will be the future of community genomics.

Conclusions and future directions

The aim of ecological genomic studies is to identify the genes and genetic pathways that underlie important ecological responses and interactions, determine the extent to which those genes and pathways exhibit functional variation in nature and characterize the ecological and evolutionary consequences of that variation. Achieving this aim will necessarily require a multidisciplinary approach. Using approaches from disparate areas of biology in the same research program is far from a simple task, however. Not only does it require different areas of experimental expertise, but also a conceptual integration and understanding of mechanisms and interactions at different levels of biological organization.

Currently, work in this area is most feasible in organisms with well-developed genomic resources. The most extensive genomic resources are currently available only in a selected number of model organisms whose ecology is not well studied. Transferring genomic tools from model organisms to close relatives may represent one opportunity to expand the number and diversity of species amenable to this type of research program. Genomic resources are now also being developed for several species with rich histories of ecological investigation; these species will likely emerge as the new ‘models’ for ecological genomics research. Advances in sequencing technology will aid progress in ecological genomics research by allowing genomic tools to be developed for many more species. For example, massively parallel sequencing methods (for example, 454 Life Sciences) may allow many more genomes to be sequenced in a cost-effective manner. We imagine this could become the first step in initiating an ecological genomics research program for many species. From these sequencing efforts, microarrays, proteomics and other tools can be developed that can lead to the discovery of candidate genes. However, as we discussed, functional tests of these genes are needed to ultimately determine their importance in any ecological interaction. This is not yet feasible for non-model species, and a future challenge will be to develop these methodologies, perhaps using RNA interference, to allow such functional tests to be performed. Alternatively, in some cases other approaches such as functional gene chips may provide sufficient additional insights on community processes without the need for functional tests. The road ahead will be difficult, but we think the insights into genome function and ecological genetic mechanisms that can be gained will be worthwhile, justifying our increased efforts.