Abstract
Understanding molecular factors determining local adaptation is a key challenge, particularly relevant for plants, which are sessile organisms coping with a continuously fluctuating environment. Here we introduce a rigorous network-based approach for investigating the relation between geographic location of accessions and heterogeneous molecular phenotypes. We demonstrate for Arabidopsis accessions that not only genotypic variability but also flowering and metabolic phenotypes show a robust pattern of isolation-by-distance. Our approach opens new avenues to investigate relations between geographic origin and heterogeneous molecular phenotypes, like metabolite profiles, which can easily be obtained in species where genome data is not yet available.
Introduction
The naturally occurring accessions of Arabidopsis thaliana (Arabidopsis) are found across continents and have adapted to various growth habitats1,2,3. This together with their known genetic basis and geographic origin has led not only to the identification of ecologically relevant traits4,5,6,7, but also global patterns of genetic diversity8,9,10 and their relation to climate11,12. Recently it was discovered that genetically similar Arabidopsis accessions derive from geographically more closely related locations, suggesting a robust pattern of isolation by distance13,14,15. However, these findings were obtained by relating the genetic and geographic distances either across all accession pairs or via parameter-dependent neighbourhood structures, and without correcting for climate effects. They were also restricted to genotypic variation, whereas selection will act on phenotypic traits.
Here we provide a parameter-free network-based approach for mapping heterogeneous molecular phenotypes on networks constructed from geographic location data. We use this approach in combination with corrections for climate effects to demonstrate that not only genotypic variability, but also flowering and metabolic phenotypes robustly relate to geographic origin of Arabidopsis accessions as predicted by the isolation-by-distance model.
Results
Phenotypic and genotypic data sets
Metabolic profiles contain information about the levels of large numbers of metabolites. They provide an integrative phenotype that has already been shown to be predictive of biomass yield16,17,18, heterosis19 and, to a lesser extent, abiotic stress tolerance20, as well as to be indicative of wine quality from different sites21,22,23. We analysed the levels of 49 metabolites in 92 diverse accessions, including lines with different growth habitats and geographic origins (Supplementary Table S1). The majority of the analysed accessions come from Eurasia, together with a few accessions from North America and Africa. The accessions were grown ex situ under standardized irradiance, photoperiod and temperature. Metabolic profiles were determined in a 12-h light/12-h dark photoperiod at two levels of nitrogen fertilization: one allowing close to maximal growth (OpN) and another limiting growth (LiN)24, as well as in an 8-h light/16-h dark photoperiod with high nitrogen supply when growth is limited by carbon (LiC)25. Carbon is the major component of plant biomass, and short photoperiods lead to a coordinated decrease in metabolism and growth to maintain a balance between photosynthetic assimilation, storage and use of carbon26,27. Nitrogen is often a limiting nutrient of plant growth, and the molecular basis for its assimilation by plants is well-established28. Uptake and remobilization of nitrogen have been investigated in a small number of Arabidopsis accessions29,30,31,32,33, but the extent to which variation in metabolic processes reflects adaptations to specific environments and how this variation is maintained with regard to geographic proximity and climate remain elusive. As further data sets, we used a publicly available data set for flowering phenotypes covering 40 of the 92 accessions34, and two independent single-nucleotide polymorphism (SNP) data sets13,15 covering 69 and 80 of the 92 accessions, respectively (Supplementary Table S2).
Analysis based on dense structure
To gain insight into dependence on geographic origin, we first generated distance matrices for genotype and for each phenotype, as well as for geographic locations. The resulting matrices retain information about the relationships between all pairs of accessions, and, thus, are representative of the dense or global structure. The relation between the matrices was examined with the help of the Mantel correlation35 (Supplementary Fig. S1). The analysis indicated that the Mantel correlation between geographic and genomic distance is positive and significant at level 0.05 (Table 1). Analogous analysis of the relation between the difference in flowering phenotypes and geographic distances suggests smaller and non-significant correlation values. For the OpN metabolic phenotype, a positive and significant correlation was observed. This indicated that in near-optimal growth conditions, differences in metabolite profiles, like those of SNPs, become larger with increased spatial dispersion, thus hinting at isolation by distance. This relation broke down for metabolite profiles collected in carbon-limited plants and nitrogen-limited plants, for which a non-significant positive relation and a slightly negative relation were found, respectively.
To exclude the effect of climate from these analyses, we calculated the partial Mantel correlation between differences in genotype or phenotypic trait and geographic distances while controlling for the following five climate variables: daily minimum, average, and maximum air temperatures, relative humidity and daylight hours36. When the effect of climate was controlled, the partial Mantel correlation between geographic and genomic distances was positive and smaller than the full correlation, but not always significant (Table 1). We did not find a significant correlation either between the geographic distances and differences in flowering phenotypes or between geographic distances and differences in OpN, LiN or LiC metabolic phenotypes, although the correlations for OpN and LiC remained positive while that for LiN was negative. This indicates that relationships found in the analysis of the dense structure may be at least partly driven by climatic factors, which will recur at different places on the globe, rather than geographical distance per se.
Sparse network-based approach for local structure
We next investigated whether there is a consistent relation between differences in proximity structure of accessions and genotype or phenotypic traits. Proximity structure captures the sparse or local geographic relations between accessions, and is given by the relative neighbourhood (RN) network37 (Fig. 1a). The RN network provides a well-defined reference for mapping of various phenotypic data. It was generated from bilateral relationships, whereby two accessions are considered neighbours if there is no other accession at a smaller geometric distance. The distance between the phenotype, p, of two adjacent accessions (that is, nodes) was used to calculate the weight of the corresponding edge (Fig. 1b). Each node u is in turn described by $\theta_u$, the average of the edge-weights incident on it (Fig. 1c). The entire network G is characterized by $\theta_G$, the average of the resulting node-weights. The lower the value of $\theta_G$, the more similar the metabolic phenotypes between neighbouring accessions. The salient network properties of the networks resulting from the three conditions are summarized in Supplementary Table S3. We note that with this approach, geographic distances were considered in setting up the RN network, but not in weighting of the nodes and edges. This renders the approach free of subjectively imposed distance cutoffs.
Geographical origin analysis based on sparse structure
The weighted RN network was used to investigate the pattern of local changes with respect to geographic origin. The relationship between proximity structure and genotype or phenotype was explored by using three statistics from classical geographic variability (GV) analysis, namely: Moran’s I38, Geary’s C39, and the Global G40. The first two statistics test the hypothesis that there is a spatial relationship between quantities mapped on the network against the null hypothesis of homogeneous spatial distribution. The Global G statistic tests whether there are spatial bursts of high (or low) values in an otherwise homogeneous space. All three statistics indicated positive relations of flowering phenotypes and the three metabolic phenotypes with geographic distance (Table 2). However, with these accessions (Supplementary Table S2), we did not observe an isolation-by-distance pattern for genotypic differences; Moran’s I and Geary’s C statistics based on $\theta_u$ indicated the absence of a statistically significant positive relation between genotypic differences of neighbouring accessions (Table 2). These findings suggest that metabolic and flowering phenotypes are likely to show highly convergent local adaptation following the isolation-by-distance model even when neighbouring accessions may exhibit larger genetic variation.
In addition, we considered whether the metabolite profiles might be related to flowering traits, which would mean that these two phenotypes are not truly independent. The plants used for the metabolomics analysis were harvested long before floral induction. Analysis of the correlation structure between the metabolite and flowering phenotypes across 40 accessions (Supplementary Fig. S2, Supplementary Table S4) demonstrated the lack of a consistent relationship across the three conditions. This was further supported by the lack of congruence for pairs of the resulting correlation matrices across conditions, as demonstrated by the RV coefficient (Supplementary Table S5), suggesting a complex interplay between the two phenotypes41.
Taken together, when sparse analysis was used, isolation-by-distance was observed at the level of metabolic and flowering phenotypes but not at the level of genetic variability for the analysed accessions (Supplementary Table S2). The absence of a relationship with genotypic distance apparently contrasts with recent studies, which reported isolation by distance13. Nevertheless, performing the proposed analysis by using the RN network on a larger set of 170 accessions34 indicated that the isolation-by-distance model was also confirmed with SNP data (Supplementary Table S6). Moreover, the values for the statistics were in quantitative agreement with those obtained from metabolic and flowering phenotypes (Table 2 and Supplementary Table S6). This raises the question why isolation-by-distance at the level of genetic variability is only revealed when the sparse analysis is performed with a larger number of accessions. As recent studies42,43 have demonstrated that only 9.4–18.5% of SNPs in A. thaliana are functionally relevant, the usage of the whole set of SNPs may introduce artifacts and reduce the robustness of the statistics, an effect that is particularly pronounced in smaller populations (as demonstrated in the analysis of robustness). Moreover, whole-genome scale SNP variation also includes neutral variation, which may mask the genetic patterns that are solely due to local adaptation, especially with a limited number of accessions, whereas metabolic and flowering traits are more likely directly under natural selection.
The proposed mapping of heterogeneous phenotypes on the RN network used in our sparse analysis can reduce bias in examining differences in phenotypes, as it does not consider relations between otherwise unrelated accessions generated from the k nearest neighbours (kNN) of each accession13. In contrast to the kNN network, which may include unilateral relationships and is dependent on the arbitrarily chosen parameter k, the RN network is not only more stringent but also uniquely determined by the locations of the analysed accessions. To emphasize this claim, we compared the results from the RN and kNN network (Supplementary Table S7): examination of the three statistics based on the kNN network demonstrated that their values change drastically with varying k. This implies that a sound conclusion in support of the isolation-by-distance model cannot be readily obtained with the kNN network as there is no objective rule for the selection of a value for k.
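The contrast with the kNN network can be illustrated with a minimal Python sketch. It is not the implementation used in the study; the function names `knn_directed_edges` and `unilateral_edges` are hypothetical, and points are assumed to be planar coordinates. It shows that a kNN network can contain one-way (unilateral) relationships and that its edge set changes with k.

```python
import math

def knn_directed_edges(points, k):
    """Directed edges (i, j): j is among the k nearest neighbours of i."""
    edges = set()
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        edges.update((i, j) for j in others[:k])
    return edges

def unilateral_edges(edges):
    """Edges whose reverse direction is absent, i.e. one-way relationships."""
    return {(i, j) for i, j in edges if (j, i) not in edges}

# Three hypothetical sites on a line: the outlying third site names the
# second as its nearest neighbour, but the relation is not returned.
sites = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
one_nn = knn_directed_edges(sites, k=1)
two_nn = knn_directed_edges(sites, k=2)
```

With k = 1 the sketch yields one unilateral edge, while with k = 2 every pair is linked in both directions, so the network itself depends on the choice of k.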
Metabolites related to pattern formation
To determine whether a particular metabolite has an effect on the autoregressive model for $\theta_u$, we calculated the difference in the Moran’s I statistic from the metabolic phenotypes with and without the metabolite. Metabolites are then ranked based on the z-normalized differences, which separates two classes of opposite effect. The z-scores across all metabolites are presented in Fig. 2. In LiN, carbohydrates and amino acids had opposite effects, with negative values for many carbohydrates like starch, maltose and xylose, and positive effects for central amino acids like glutamine and glutamate, as well as nicotinic acid. The presence of carbohydrates and nitrogen-containing metabolites points to metabolism in nitrogen-limiting conditions as a single yet tightly connected large network44. The pattern was strikingly different for the OpN phenotype, with very strong negative values from the two nitrogen-rich amino acids, glutamine and asparagine, and smaller values from β-alanine and 4-amino-butyric acid, two intermediates in amino acid degradation. In LiC, there is a strong effect for maltose, trehalose, leucine and isoleucine.
Robustness of findings
To investigate the robustness of the statistics from the analyses of dense and sparse structures, we repeated the analysis following exclusion of 5–25% of the analysed accessions. Our findings indicated a general trend that the variability of the statistics on the sparse structure, captured in the RN network together with the proposed mapping of phenotypes, was smaller than the variability of the statistics on the dense structure. In addition, consistently smaller variability was found for the statistics based on the metabolic phenotype than for genomic data, as indicated by the values of the squared coefficient of variation (Table 3). To capture the effects of the sparse proximity structure in combination with climate factors, we also tested a spatial simultaneous autoregressive model for $\theta_u$. The spatial parameter is positively significant, with a value of 0.66, 0.81 and 0.75 for the metabolic phenotypes under OpN, LiN and LiC conditions, respectively. None of the other factors significantly influences the regression (Supplementary Table S8).
Discussion
To summarize, our results show that patterns of ecological isolation can be robustly identified with the proposed method for mapping genotypic variation and metabolic and flowering phenotypes on sparse proximity structure. This approach avoids potential inclusion of bias due to heterogeneity of geographic terrain, which often implies usage of air distances and various distance-related cutoffs. Moreover, we demonstrate that the three statistics commonly used in GV analysis reveal the congruence between two very different phenotypic traits: flowering phenotypes and metabolic phenotypes. This opens up the possibility of a research strategy for analysing proximity relations in less well-characterized species for which genome data is not yet available, including closely related species whose genomes are divergent enough to require de novo assembly, but for which metabolic phenotypes would be facile to obtain.
Methods
Distance measures
The different types of data require specific distance measures to investigate how phenotypic and genetic variability relate to geographic origin. To approximate Euclidean distances despite Earth's curvature, the longitude and latitude are converted from angular units to kilometres by multiplying the given figures by 53 and 69.1 km, respectively. To reduce artifacts, the remaining types of numeric profiles are first z-normalized. Distances between z-normalized numerical profiles are obtained based on the Euclidean metric. Distances between DNA fingerprints13 and SNP data15,34 are determined by a simple count of pair mismatches. While DNA fingerprints warrant the usage of modified scores, following probabilistic treatment of wildcards, for reasons of objective comparison between the two data sets on genetic variability we did not further consider this approach.
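The distance measures described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function names are hypothetical, z-normalization is assumed to be applied per variable across accessions, and the sample standard deviation (n − 1) is assumed because the text does not specify either choice.

```python
import math

# Conversion factors quoted in the text, per degree of longitude and latitude.
LON_FACTOR = 53.0
LAT_FACTOR = 69.1

def geographic_distance(a, b):
    """Planar approximation; a and b are (longitude, latitude) pairs."""
    dx = (a[0] - b[0]) * LON_FACTOR
    dy = (a[1] - b[1]) * LAT_FACTOR
    return math.hypot(dx, dy)

def z_normalize_columns(profiles):
    """z-normalize each variable (column) across accessions (rows).

    Assumption: per-variable normalization with the sample standard deviation.
    """
    n = len(profiles)
    normalized_columns = []
    for column in zip(*profiles):
        mean = sum(column) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        normalized_columns.append(
            [(v - mean) / sd if sd > 0 else 0.0 for v in column]
        )
    return [list(row) for row in zip(*normalized_columns)]

def euclidean(p, q):
    """Euclidean distance between two numeric profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mismatch_count(g, h):
    """Number of positions at which two genotype strings differ."""
    return sum(1 for a, b in zip(g, h) if a != b)
```

Any of these functions can then be applied to all pairs of accessions to fill a distance matrix.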
Analysis based on dense global structure
To determine how phenotypic variability and genetic diversity relate to geographic location, the distance measures detailed above were applied to each profile type across all pairs of accessions. The resulting distance matrices capturing all-to-all accession differences were analysed by using the Mantel correlation as implemented in the function mantel from the ecodist package in R45 (Supplementary Fig. S1). To exclude the effect of climate, we determined the partial Mantel correlation while controlling for the five climate characteristics enumerated above. The calculations for the partial Mantel correlation were performed by using the same function in R.
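For readers unfamiliar with the Mantel correlation, the following Python sketch shows the basic idea: the Pearson correlation between the off-diagonal entries of two distance matrices, with significance assessed by permuting the accessions of one matrix. It is not the ecodist implementation; the function names, the one-sided p-value and the default of 999 permutations are assumptions made for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (Mantel r, one-sided permutation p-value) for two distance matrices."""
    n = len(dist_a)
    identity = list(range(n))
    b_values = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), b_values)
    rng = random.Random(seed)
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute accessions, i.e. rows and columns together
        if pearson(upper_triangle(dist_a, order), b_values) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; this is the reason a permutation test is used instead of the usual test for a correlation coefficient.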
Analysis based on sparse local structure
GV analysis seeks to identify patterns of genotypic or phenotypic relatedness dependent on the geographic positions and patterns of dispersal for the biological entity of interest. To this end, one or more variables are commonly mapped onto a set of given geographic sites, specified by their respective longitude and latitude, or areal unit centroids (see ref. 46 and references therein). While in the classical GV analyses, these variables may be interval, ordinal, or nominal, with the advances in high-throughput technologies, biological entities are often described by vector profiles including different system level responses (for example, transcriptomic, proteomic, metabolomic) to genetic and/or environmental perturbations.
Many of the techniques from GV analysis require specification of the geographic proximity between the entities which, in turn, can be employed to establish the adjacency relations. The pattern of geographic variation of a variable of interest can then be evaluated with regard to the interconnectedness of the sampling location for which the variable has been measured or observed. To discern such patterns, one usually uses various statistics determining how the variable’s level for each entity is correlated with an appropriately scaled average of the levels from the entity’s neighbours. As the correlation is calculated on the same variable, it is usually referred to as spatial autocorrelation. The correlation can be global, as in the case of the Moran’s I statistics38, which assumes spatial homogeneity, or can take into account local effects, such as the case of the Geary’s C statistics39 and Anselin’s local indicators of spatial association47. Therefore, it is obvious that any analysis of the spatial autocorrelation in the case when each biological entity is described by its location and is attributed a variable in a vector form requires: (i) an appropriate choice of the definition for geographic proximity and (ii) a novel statistical method which can be used in identifying the patterns with such variables.
RN network
Geometric networks provide a formal way to capture the concept of proximity (referred to as neighbourhood) often encountered with geographic locations specified by their longitude and latitude. In geometric networks, the nodes describe the spatial (geographic) locations of given entities, and two nodes are connected by an edge if a well-defined neighbourhood is empty. The neighbourhood is called empty if and only if no location lies in its interior (except when entire half-space is involved). Let $d(x,y)$ denote the distance between any two nodes $x, y \in S$. In all calculations, we consider the Euclidean distance between the two nodes, that is, $d(x,y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$, where $(x_1, x_2)$ and $(y_1, y_2)$ are the coordinates of x and y. In the following, we consider the RN network, whereby two nodes x and y from a given set of nodes S are defined to be adjacent (that is, proximal) if and only if $d(x,y) \le \max\{d(x,z), d(y,z)\}$ for every $z \in S$, $z \ne x, y$. Note that for a given set of nodes S the so-defined RN network is unique and does not depend on any subjectively imposed thresholds on the underlying distance structure.
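The adjacency rule of the RN network can be written as a short Python sketch. It assumes the standard relative neighbourhood condition (two nodes are adjacent unless a third node is closer to both of them than they are to each other) and planar coordinates; the function name is hypothetical and the cubic-time loop is chosen for clarity, not efficiency.

```python
import math
from itertools import combinations

def relative_neighbourhood_edges(points):
    """Return the edges (i, j), i < j, of the relative neighbourhood network."""
    def d(i, j):
        return math.dist(points[i], points[j])

    n = len(points)
    edges = []
    for i, j in combinations(range(n), 2):
        dij = d(i, j)
        # Keep the edge only if no third node k is closer to both i and j
        # than i and j are to each other.
        if all(dij <= max(d(i, k), d(j, k)) for k in range(n) if k not in (i, j)):
            edges.append((i, j))
    return edges

# Hypothetical example: three collinear sites and a triangle of sites.
line = relative_neighbourhood_edges([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
triangle = relative_neighbourhood_edges([(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)])
```

No parameter enters the construction: the edge set follows from the coordinates alone, which is the property the text contrasts with threshold-based neighbourhoods.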
Mapping vector profiles on RN network
Each accession is considered as a node, specified by its latitude and longitude. Moreover, to illustrate the method, we consider that each site (accession) $x \in S$ is described by its metabolic profile over m metabolites. For the set of nodes, S, containing n given accessions, we first calculate the corresponding RN network based on the available geographic origin information. Given a geometric graph G, we then determine the weight $\theta_{xy}$ of each edge $(x,y) \in E(G)$ as the Euclidean distance between the (z-normalized) metabolic profiles of its incident nodes, that is, $\theta_{xy} = \lVert p_x - p_y \rVert_2$, where $p_x$ denotes the z-normalized profile of x. In addition, each accession is characterized by the mean of the weights of its incident edges; in other words, an accession x is assigned a weight $\theta_x$ such that $\theta_x = \frac{1}{k(x)} \sum_{y \in N(x)} \theta_{xy}$, where $N(x)$ is the set of neighbours of x and $k(x)$ denotes the degree (number of neighbours) of the node x. Finally, the entire graph G is associated with a weight $\theta_G = \frac{1}{n} \sum_{x \in S} \theta_x$. Any appropriate distance measure, as detailed above, can be used to map different types of profiles on the RN network.
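A minimal Python sketch of this mapping follows. It assumes that the RN edges are already available as node pairs and that the profiles are already z-normalized; the function name `map_profiles` and the toy data are hypothetical.

```python
import math

def map_profiles(edges, profiles):
    """Compute edge weights, node weights and the graph weight.

    edges: iterable of (x, y) node pairs of the geometric network
    profiles: dict mapping each node to its (z-normalized) profile vector
    """
    # Edge weight: Euclidean distance between the profiles of the end nodes.
    edge_weights = {(x, y): math.dist(profiles[x], profiles[y]) for x, y in edges}

    # Node weight: mean weight of the edges incident on the node.
    incident = {}
    for (x, y), weight in edge_weights.items():
        incident.setdefault(x, []).append(weight)
        incident.setdefault(y, []).append(weight)
    node_weights = {x: sum(ws) / len(ws) for x, ws in incident.items()}

    # Graph weight: mean of the node weights.
    graph_weight = sum(node_weights.values()) / len(node_weights)
    return edge_weights, node_weights, graph_weight

# Toy example: a path of three accessions with two-metabolite profiles.
toy_edges = [(0, 1), (1, 2)]
toy_profiles = {0: [0.0, 0.0], 1: [3.0, 4.0], 2: [3.0, 4.0]}
edge_w, node_w, graph_w = map_profiles(toy_edges, toy_profiles)
```

In the toy example the middle accession receives the mean of a large and a zero edge weight, which shows how a node weight summarizes dissimilarity to the immediate geographic neighbourhood only.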
The local weights, establishing the connexion between the profiles of each accession and its immediate geographic neighbourhood, can further be subjected to the classical GV analysis, including the Moran’s I, Geary’s C and the Global G statistics40. Values for Moran’s I closer to 1 indicate positive, while values closer to −1 indicate negative spatial autocorrelation; a value of zero signifies random spatial pattern. The values for Geary’s C lie in the range between 0 and 2. Here a value of 1 indicates random spatial pattern, while values smaller (larger) than 1 indicate positive (negative) spatial autocorrelation. On the other hand, Global G seeks to establish if there are spatial bursts of high (low) values in an otherwise homogeneous space.
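As an illustration of how two of these statistics are evaluated on a network, the Python sketch below computes Moran's I and Geary's C from node values and an undirected edge list. It assumes binary, symmetric adjacency weights; the study may have used a different weighting scheme (for example row-standardized weights), so the numbers are illustrative only. The function name is hypothetical, and the Global G statistic and significance testing are omitted.

```python
def spatial_autocorrelation(edges, values):
    """Moran's I and Geary's C with binary weights on an undirected network.

    edges: list of (i, j) index pairs, each undirected edge listed once
    values: list of node values indexed like the nodes
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total = sum(d * d for d in deviations)

    # Each undirected edge contributes twice to the sums over ordered pairs,
    # so the factor of two cancels in Moran's I and remains in Geary's C.
    cross = sum(deviations[i] * deviations[j] for i, j in edges)
    squared_diff = sum((values[i] - values[j]) ** 2 for i, j in edges)
    m = len(edges)

    morans_i = n * cross / (m * total)
    gearys_c = (n - 1) * squared_diff / (2 * m * total)
    return morans_i, gearys_c

path = [(0, 1), (1, 2), (2, 3)]
gradient = spatial_autocorrelation(path, [1.0, 2.0, 3.0, 4.0])
alternating = spatial_autocorrelation(path, [1.0, 4.0, 1.0, 4.0])
```

The two toy inputs place the same kind of values on the same path network, once as a smooth gradient and once alternating between neighbours, so the resulting pairs of statistics can be compared directly.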
To capture the effects of the sparse proximity structure in combination with climate factors, we also tested a spatial simultaneous autoregressive lag model for $\theta_x$. We used the five climate characteristics (daily average, maximum and minimum air temperature, relative humidity, and daylight hours) as additional variables in the autoregressive model. The spatial autoregressive parameter (rho) was calculated with the trace approximation method48 implemented in the lagsarlm function from the spdep package in R49.
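For orientation, a spatial simultaneous autoregressive lag model is usually written in the standard form below. The symbols W, X, β and ε are not defined in the text and are introduced here as assumptions: θ is taken to be the vector of node weights, W a spatial weights matrix derived from the RN network, and X the matrix of the five climate variables.

```latex
\theta = \rho\, W \theta + X \beta + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}\!\left(0, \sigma^{2} I_{n}\right)
```

Under this reading, a positive and significant ρ means that the node weight of an accession increases with the weights of its network neighbours after the climate covariates have been accounted for.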
Statistical sensitivity analysis
In this section we detail the statistical sensitivity analysis, which can be used to determine the metabolites of highest influence on the outcome of GV analysis. The method relies on the proposed $\theta_x$ statistic for each accession and Moran’s I statistic; it consists of the following steps:
1. Determine Moran’s I based on the $\theta_x$ statistic over the entire metabolic profile, and call it $I_{\mathrm{obs}}$
2. For every metabolite M
3. Determine Moran’s I based on the $\theta_x$ statistic calculated based on the metabolic profile from which the metabolite M is excluded
4. Assign the obtained value for I as a weight of the metabolite, and call it $I_M$
5. End for
6. For every metabolite M
7. Calculate the difference $\Delta_M = I_{\mathrm{obs}} - I_M$
8. End for
9. Perform a z-transformation on the obtained vector $\Delta$
10. Report the metabolites whose z-score is at least half s.d. above/below the mean
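The steps above can be sketched in Python as follows. The sketch is self-contained and makes several assumptions that are not stated in the text: binary adjacency weights for Moran's I, profiles stored as lists that are already z-normalized, the sample standard deviation for the z-transformation, and hypothetical function names and toy data.

```python
import math

def node_weights(edges, profiles):
    """Mean Euclidean profile distance to the network neighbours of each node."""
    incident = {}
    for x, y in edges:
        weight = math.dist(profiles[x], profiles[y])
        incident.setdefault(x, []).append(weight)
        incident.setdefault(y, []).append(weight)
    return {x: sum(ws) / len(ws) for x, ws in incident.items()}

def morans_i(edges, values):
    """Moran's I with binary weights; values is a dict node -> number."""
    n = len(values)
    mean = sum(values.values()) / n
    deviations = {x: v - mean for x, v in values.items()}
    total = sum(d * d for d in deviations.values())
    cross = sum(deviations[x] * deviations[y] for x, y in edges)
    return n * cross / (len(edges) * total)

def metabolite_sensitivity(edges, profiles, names, threshold=0.5):
    """Leave-one-metabolite-out change in Moran's I, reported as z-scores."""
    i_obs = morans_i(edges, node_weights(edges, profiles))
    deltas = []
    for m in range(len(names)):
        # Drop metabolite m from every profile and recompute the statistic.
        reduced = {x: p[:m] + p[m + 1:] for x, p in profiles.items()}
        deltas.append(i_obs - morans_i(edges, node_weights(edges, reduced)))
    mean = sum(deltas) / len(deltas)
    sd = math.sqrt(sum((d - mean) ** 2 for d in deltas) / (len(deltas) - 1))
    z_scores = [(d - mean) / sd for d in deltas]
    # Report metabolites at least `threshold` s.d. above or below the mean.
    return {name: z for name, z in zip(names, z_scores) if abs(z) >= threshold}

# Toy data: four accessions on a path network, three hypothetical metabolites.
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_profiles = {
    0: [0.0, 1.0, 2.0],
    1: [1.0, 1.5, 0.0],
    2: [3.0, 0.5, 1.0],
    3: [6.0, 2.0, 5.0],
}
toy_names = ["met_a", "met_b", "met_c"]
flagged = metabolite_sensitivity(toy_edges, toy_profiles, toy_names)
```

The sign of each reported z-score separates metabolites whose removal raises Moran's I from those whose removal lowers it, which corresponds to the two classes of opposite effect.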
Robustness analysis
The findings from the analysis of phenotypic and genetic variability with respect to geography may vary depending on the considered accessions. To establish a quantitative measure for the robustness of the findings from the analyses based on the dense (global), as well as the sparse (local) structure, we first calculated all statistics upon 100 random removals of 5, 10, 15, 20, and 25% of the analysed accessions. As the employed statistics take positive and negative values, we considered the squared coefficient of variation as a quantitative measure for comparison of the robustness from the different analyses and data types50.
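The resampling scheme and the robustness measure can be sketched in Python as below. The function names are hypothetical, the sample variance is assumed for the coefficient of variation, and `statistic` stands for any function that maps a list of retained accessions to a number, such as a Mantel correlation or Moran's I.

```python
import random
import statistics

def squared_cv(values):
    """Squared coefficient of variation: variance divided by the squared mean."""
    mean = statistics.fmean(values)
    return statistics.variance(values) / mean ** 2

def subsample_statistic(accessions, statistic, fraction, repeats=100, seed=0):
    """Recompute a statistic after repeated random removal of accessions.

    fraction: share of accessions removed in each repeat (for example 0.05)
    """
    rng = random.Random(seed)
    keep = len(accessions) - round(fraction * len(accessions))
    return [statistic(rng.sample(accessions, keep)) for _ in range(repeats)]

# Usage with a placeholder statistic (the number of retained accessions).
retained_sizes = subsample_statistic(list(range(20)), len, fraction=0.25)
```

Squaring removes the dependence on the sign of the mean, so statistics centred on negative and on positive values can be compared on the same scale.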
Additional information
How to cite this article: Kleessen, S. et al. Structured patterns in geographic variability of metabolic phenotypes in Arabidopsis thaliana. Nat. Commun. 3:1319 doi: 10.1038/ncomms2333 (2012).
References
Trontin C., Tisné S., Bach L., Loudet O. What does Arabidopsis natural variation teach us (and does not teach us) about adaptation in plants? Curr. Opin. Plant. Biol. 14, 225–231 (2011).
Weigel D. Natural variation in Arabidopsis thaliana: from molecular genetics to ecological genomics. Plant Physiol. 158, 2–22 (2011).
Koornneef M., Alonso-Blanco C., Vreugdenhil D. Naturally occurring genetic variation in Arabidopsis thaliana. Annu. Rev. Plant Biol. 55, 141–172 (2004).
Aranzana M. J. et al. Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genet. 1, e60 (2005).
Banta J. A., Dole J., Cruzan M. B., Pigliucci M. Evidence of local adaptation to coarse-grained environmental variation in Arabidopsis thaliana. Evolution 61, 2419–2432 (2007).
Shindo C., Bernasconi G., Hardtke C. S. Natural genetic variation in Arabidopsis: tools, traits and prospects for evolutionary ecology. Ann. Bot. 99, 1043–1054 (2007).
Bouchabke O. et al. Natural variation in Arabidopsis thaliana as a tool for highlighting differential drought responses. PloS One 3, e1705 (2008).
Nordborg M. et al. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol 3, e196 (2005).
Beck J. B., Schmuths H., Schaal B. A. Native range genetic variation in Arabidopsis thaliana is strongly geographically structured and reflects Pleistocene glacial dynamics. Mol. Ecol. 17, 902–915 (2008).
Picó F. X., Méndez-Vigo B., Martínez-Zapater J. M., Alonso-Blanco C. Natural genetic variation of Arabidopsis thaliana is geographically structured in the Iberian peninsula. Genetics 180, 1009–1021 (2008).
Hancock A. M. et al. Adaptation to climate across the Arabidopsis thaliana genome. Science 334, 83–86 (2011).
Fournier-Level A. et al. A map of local adaptation in Arabidopsis thaliana. Science 334, 86–89 (2011).
Anastasio A. E. et al. Source verification of mis-identified Arabidopsis thaliana accessions. Plant J. 67, 554–566 (2011).
Platt A. et al. The scale of population structure in Arabidopsis thaliana. PLoS Genet. 6, e1000843 (2010).
Horton M. W. et al. Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel. Nat. Genet. 44, 212–216 (2012).
Meyer R. C. et al. The metabolic signature related to high plant growth rate in Arabidopsis thaliana. Proc. Natl Acad. Sci. U.S.A 104, 4759–4764 (2007).
Sulpice R. et al. Starch as a major integrator in the regulation of plant growth. Proc. Natl Acad. Sci. U.S.A 106, 10348–10353 (2009).
Schauer N. et al. Comprehensive metabolic profiling and phenotyping of interspecific introgression lines for tomato improvement. Nat. Biotechnol. 24, 447–454 (2006).
Riedelsheimer C. et al. Genomic and metabolic prediction of complex heterotic traits in hybrid maize. Nat. Genet. 44, 217–220 (2012).
Hirayama T., Shinozaki K. Research on plant abiotic stress responses in the post-genome era: past, present and future. Plant J. 61, 1041–1052 (2010).
Pereira G. E. et al. 1H NMR and chemometrics to characterize mature grape berries in four wine-growing areas in Bordeaux, France. J. Agric. Food Chem. 53, 6382–6389 (2005).
López-Rituerto E. et al. Investigations of La Rioja terroir for wine production using 1H NMR metabolomics. J. Agric. Food Chem. 60, 3452–3461 (2012).
Saurina J. Characterization of wines using compositional profiles and chemometrics. Trend. Analyt. Chem. 29, 234–245 (2010).
Tschoep H. et al. Adjustment of growth and central metabolism to a mild but sustained nitrogen-limitation in Arabidopsis. Plant. Cell. Environ. 32, 300–318 (2009).
Gibon Y. et al. Adjustment of growth, starch turnover, protein content and central metabolism to a decrease of the carbon supply when Arabidopsis is grown in very short photoperiods. Plant. Cell. Environ. 32, 859–874 (2009).
Smith A. M., Stitt M. Coordination of carbon supply and plant growth. Plant, Cell & Environment 30, 1126–1149 (2007).
Stitt M., Zeeman S. Starch turnover: pathways, regulation and role in growth. Curr. Opin. Plant. Biol. 15, 282–292 (2012).
Temple S. J., Vance C. P., Stephen Gantt J. Glutamate synthase and nitrogen assimilation. Trends Plant Sci. 3, 51–56 (1998).
Robinson D. The responses of plants to non-uniform supplies of nutrients. New Phytologist 127, 635–674 (1994).
Forde B., Lorenzo H. The nutritional control of root development. Plant Soil 232, 51–68 (2001).
Walch-Liu P., Forde B. G. Nitrate signalling mediated by the NRT1.1 nitrate transporter antagonises L-glutamate-induced changes in root architecture. Plant J. 54, 820–828 (2008).
Masclaux-Daubresse C. et al. Nitrogen uptake, assimilation and remobilization in plants: challenges for sustainable and productive agriculture. Annals of Botany 105, 1141–1157 (2010).
Ikram S., Bedu M., Daniel-Vedele F., Chaillou S., Chardon F. Natural variation of Arabidopsis response to nitrogen availability. J. Exp. Bot. 63, 91–105 (2012).
Atwell S. et al. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 465, 627–631 (2010).
Mantel N. The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209–220 (1967).
NASA Surface meteorology and Solar Energy: Global Data Sets. at http://eosweb.larc.nasa.gov/cgi-bin/sse/sse.cgi.
Toussaint G. T. The relative neighbourhood graph of a finite planar set. Pattern Recognit. 12, 261–268 (1980).
Moran P. A. P. Notes on continuous stochastic phenomena. Biometrika 37, 17–23 (1950).
Geary R. C. The contiguity ratio and statistical mapping. The Incorporated Statistician 5, 115–146 (1954).
Getis A., Ord J. K. The analysis of spatial association by use of distance statistics. Geogr. Anal. 24, 189–206 (1992).
El-Lithy M. E., Reymond M., Stich B., Koornneef M., Vreugdenhil D. Relation among plant growth, carbohydrates and flowering time in the Arabidopsis Landsberg erecta x Kondara recombinant inbred line population. Plant, Cell & Environ. 33, 1369–1382 (2010).
Cao J. et al. Whole-genome sequencing of multiple Arabidopsis thaliana populations. Nat. Genet. 43, 956–963 (2011).
Clark R. M. et al. Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science 317, 338–342 (2007).
Sulpice R. et al. Network analysis of enzyme activities and metabolite levels and their relationship to biomass in a large panel of Arabidopsis accessions. Plant Cell 22, 2872–2893 (2010).
Goslee S. C., Urban D. L. The ecodist package for dissimilarity-based analysis of ecological data. J. Stat. Software 22, 1–19 (2007).
Matula D. W., Sokal R. R. Properties of gabriel graphs relevant to geographic variation research and the clustering of points in the plane. Geogr. Anal. 12, 205–222 (1980).
Anselin L. Local Indicators of spatial association-LISA. Geogr. Anal. 27, 93–115 (1995).
Smirnov O. A., Anselin L. An O(N) parallel method of computing the log-jacobian of the variable transformation for models with spatial interaction on a lattice. Comput. Stat. Data Anal. 53, 2980–2988 (2009).
spdep: Spatial dependence: weighting schemes, statistics and models. at http://cran.r-project.org/package=spdep.
Nygård F., Sandström A. Measuring Income Inequality 406–407 (Almqvist & Wicksell, 1981).
Author information
Contributions
S.K. and Z.N. designed and implemented the method; A.R.F. and M.S. conceived and designed the experiments; R.S. and C.A. performed the experiments; Z.N., S.K., A.R.F., M.S., R.L. analysed and interpreted results. All the authors discussed the results and wrote the manuscript.
Supplementary information
Supplementary Figures S1-S2 and Supplementary Tables S1-S11. (PDF 1046 kb)
Cite this article
Kleessen, S., Antonio, C., Sulpice, R. et al. Structured patterns in geographic variability of metabolic phenotypes in Arabidopsis thaliana. Nat Commun 3, 1319 (2012). https://doi.org/10.1038/ncomms2333