Full Paper

Genes and Immunity (2005) 6, 53–65. doi:10.1038/sj.gene.6364149 Published online 16 December 2004

Divergent patterns of linkage disequilibrium and haplotype structure across global populations at the interleukin-13 (IL13) locus

E Tarazona-Santos1,2 and S A Tishkoff1

1Department of Biology, University of Maryland, College Park, MD, USA

Correspondence: Dr SA Tishkoff, Department of Biology, University of Maryland, Building 144, College Park, MD 20742, USA. E-mail: tishkoff@umd.edu

2Current address: Section of Genomic Variation, Pediatric Oncology Branch, National Cancer Institute, National Institutes of Health, 8717 Grovemont Circle, Advanced Technology Center, Room 127, Gaithersburg, MD 20877, USA.

Received 3 June 2004; Revised 8 September 2004; Accepted 8 September 2004; Published online 16 December 2004.

Abstract

Interleukin-13 (IL-13) is a cytokine involved in Th2 immune response, which plays a role in susceptibility to infection by extracellular parasites as well as complex diseases of the immune system such as asthma and allergies. To determine the pattern of genetic diversity at the IL13 gene, we sequenced 3950 bp encompassing the IL13 gene and its promoter in 264 chromosomes from individuals originating from East and West Africa, Europe, China and South America. Thirty-one single-nucleotide polymorphisms (SNPs) arranged in 88 haplotypes were identified, including the nonsynonymous substitution Arg130Gln in exon 4, which differs in frequency across ethnic groups. We show that genetic diversity and linkage disequilibrium (LD) are not evenly distributed across the gene and that sites in the 5' and 3' regions of the gene show strong differentiation among continental groups. We observe a divergent pattern of haplotype variation and LD across geographic regions and we identify a set of htSNPs that will be useful for functional genetic association studies of complex disease. We use several statistical tests to distinguish the effects of natural selection and demographic history on patterns of genetic diversity at the IL13 locus.

Keywords:

cytokines, haploblocks, natural selection, linkage disequilibrium, population genetics, single-nucleotide polymorphisms (SNPs)

Introduction

Interleukin-13 (IL-13) is a cytokine involved in Th2 immune response. IL-13 plays a role in susceptibility to extracellular parasites, such as helminths, as well as susceptibility to several common complex diseases of the immune system such as asthma and allergies.1 As differential susceptibility to these diseases depends both on environmental and genetic factors,2,3 population genetic studies of cytokine genes associated with Th2 immune response, such as IL13, can contribute towards understanding the genetic basis of susceptibility to infectious disease and differential risk of developing disorders such as asthma and allergies.

IL-13 acts coordinately with the IL-4, IL-5 and IL-9 cytokines to regulate an effective Th2 response (dominated by antibody production) against extracellular pathogens. These cytokines are mapped to chromosome 5q31, forming the so-called 'Th2-cytokine cluster'.4 In the case of infection by helminths and other extracellular parasites, the Th2 response plays an important role in control of the infection and expulsion of parasites. Among the cytokines of the Th2-cluster, only IL-4 and IL-13 are able to initiate the Th2 response in the host. Both IL-13 and IL-4 foster and regulate the production of immunoglobulin E (IgE) by B cells,5 which induces the release of histamines, increasing the permeability of infected areas and allowing other components of the immune system to migrate into the area and to act. Additionally, IL-13 plays a direct role in expression of allergic asthma, resulting from an exacerbated Th2 immune response that becomes pathogenic.4

The IL13 gene encompasses 2938 bp and includes four exons, 56 bp of 5'UTR and 828 bp of 3'UTR (Figure 1). Previous studies6,7,8,9,10,11 and single-nucleotide polymorphism (SNP) discovery initiatives such as the University of Washington-Fred Hutchinson Cancer Research Center Variation Discovery Resource (http://pga.gs.washington.edu/) and SNP500Cancer (http://snp500cancer.nci.nih.gov/) have identified 32 SNPs in the IL13 gene and its promoter region in European, European-American, Chinese, Japanese, African American and African samples. These include two SNPs in the promoter region and a single nonsynonymous substitution (2043G>A, Arg130Gln), located in exon 4. Epidemiological studies in different populations have shown an association of the -1111T allele in the promoter region,8,9,12 or of the Gln130 allele,6,7,12 with different asthma-related disorders and proximal risk phenotypes, such as serum IL-13 and IgE levels. Moreover, Hoerauf et al13 have identified an association between the Gln130 allele and sowda, an immunological hyper-reactive form of onchocerciasis.

Figure 1.

Genomic structure of the IL13 gene and distribution of SNPs identified in this study. SNPs -1111T>C in the promoter region and the nonsynonymous substitution Arg130Gln (2043G>A), which have shown association with asthma-related disorders in different populations, are denoted in bold characters. The bottom part of the figure shows the SNPs present in each population, the haplotype blocks of linkage disequilibrium and htSNPs identified using the software HaploBlockFinder.

Identification of patterns of linkage disequilibrium (LD) at the genomic level, as well as within specific genes, is useful for mapping genes associated with complex diseases such as asthma, allergies or infectious disease susceptibility.14 Within a region of high LD, only a subset of representative SNPs (haplotype tag SNPs or 'htSNPs') needs to be identified and genotyped for use in association studies.14 The goal of the international HapMap project is to identify htSNPs and to characterize regions of high LD (eg haploblocks) at the genomic level in four global populations, in order to facilitate LD mapping of common diseases.15 However, additional studies in ethnically diverse populations are necessary for at least two reasons. First, the length and boundaries of haploblocks and their htSNPs may differ across human populations, and second, candidate gene association studies require definition of patterns of LD at a finer scale, including specific genes that could be involved in common diseases.14

Recent studies have shown that several genes involved in resistance to infectious disease show a genetic signature of natural selection.16,17,18,19 Thus, cytokine genes such as IL13, which are involved in the regulation of immune response, are likely to be targets of natural selection during human evolutionary history. Identification of specific variants of these genes that are targets of selection could be important for identifying functionally significant polymorphisms. In this study, we resequenced the IL13 gene in 132 individuals from East and West Africa, Europe, China and South America. We have determined the patterns of genetic diversity, haplotype structure and LD at the IL13 locus and have identified haploblocks and htSNPs in globally diverse populations. We discuss how the observed differences may have implications for genetic mapping of complex diseases associated with Th2 immune response. We also assess whether the pattern of genetic diversity at IL13 has been influenced by natural selection.

Results

Patterns of genetic diversity and haplotype structure

Estimates of genetic diversity for the IL13 gene across populations are summarized in Table 1, and for the coding and noncoding regions of the gene in Table 2. We detected 31 SNPs in the 264 chromosomes sequenced (Figure 1), six of which are singletons. The only nonsynonymous substitution we detected was the 2043G>A transition, which results in the Arg130Gln substitution reported by Heinzmann et al.6 The global estimated values of pi (0.0013) and of thetaS (0.0012) are very similar to each other and to estimates from other nuclear genes.19 We observed 30 fixed nucleotide differences between humans and chimpanzee. Assuming divergence times between these two species to be 5–7 My and generation times ranging from 20 to 25 years per generation, the estimated mutation rate per site per generation (mu) falls between 1.06 × 10^-8 and 1.90 × 10^-8. This point estimate of the mutation rate is lower than, though not significantly different from, values observed for other autosomal loci.20,21
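
The mutation-rate range quoted above can be approximated with a short calculation. The sketch below is illustrative only: it assumes the 30 fixed differences are spread over the full 3950 bp sequenced, and it uses the simple relation mu = K / (2t), where K is the per-site divergence and t is the divergence time in generations. The function name and arguments are hypothetical, not taken from the study.

```python
def mutation_rate_per_generation(fixed_differences, sequence_length_bp,
                                 divergence_time_years, generation_time_years):
    """Per-site, per-generation mutation rate from between-species divergence.

    Uses mu = K / (2 * t): K is the number of fixed differences per site and
    t is the divergence time in generations. The factor 2 accounts for
    mutations accumulating on both lineages since the split.
    """
    divergence_per_site = fixed_differences / sequence_length_bp
    generations = divergence_time_years / generation_time_years
    return divergence_per_site / (2 * generations)


# Fewest generations (5 My, 25 years per generation) gives the upper bound.
upper = mutation_rate_per_generation(30, 3950, 5e6, 25)
# Most generations (7 My, 20 years per generation) gives the lower bound.
lower = mutation_rate_per_generation(30, 3950, 7e6, 20)
print(f"mu between {lower:.2e} and {upper:.2e} per site per generation")
```

Under these assumptions the upper bound comes out at about 1.9 × 10^-8, matching the text, while the lower bound is close to, but not exactly, the published 1.06 × 10^-8. The published figure may reflect a correction or sequence length not described in this section.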



Among the populations studied, the African populations have higher levels of diversity compared to non-African populations, with West Africans having the highest level of diversity and Europeans having the lowest level of diversity (Tables 1 and 2). Across all the continental samples, the region encompassing intron 3, exon 4 (which contains the nonsynonymous substitution Arg130Gln) and the 3'UTR shows the highest diversity level (Table 2). This contrasts with the very low level of diversity observed in the region encompassing exon 2, intron 2 and exon 3. Although the diversity (pi) in the intron 3/exon 4/3' UTR region differs by more than seven standard deviations (calculated using formula 10.7 in Nei29) from the point estimates of pi in the adjacent exon 2/intron 2/exon 3 region, these two regions do not show significantly different ratios of polymorphisms to fixed differences between humans and chimpanzee, when tested by the HKA (P=0.84) and McDonald sliding window tests30,31 (Table 3).
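
The two diversity statistics used in these comparisons, pi (mean pairwise differences per site) and thetaS (Watterson's estimator based on the number of segregating sites), can be computed from phased haplotypes as in the following minimal sketch. It assumes haplotypes are given as equal-length strings covering only the variable sites, with the total sequence length supplied separately. The toy data are hypothetical and are not taken from Tables 1 or 2.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes, sequence_length_bp):
    """pi: average number of pairwise differences, per site."""
    pairs = list(combinations(haplotypes, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_differences / len(pairs) / sequence_length_bp


def watterson_theta(haplotypes, sequence_length_bp):
    """thetaS: segregating sites scaled by the harmonic number a1, per site."""
    n = len(haplotypes)
    segregating = sum(len(set(column)) > 1 for column in zip(*haplotypes))
    a1 = sum(1 / i for i in range(1, n))
    return segregating / a1 / sequence_length_bp


# Hypothetical sample: four chromosomes, three variable sites in 1000 bp.
toy_haplotypes = ["000", "100", "110", "111"]
pi = nucleotide_diversity(toy_haplotypes, 1000)
theta_s = watterson_theta(toy_haplotypes, 1000)
print(pi, theta_s)
```

Applying the same functions to sub-regions (for example, only the sites in intron 3, exon 4 and the 3'UTR, with that region's length) would give region-specific estimates of the kind compared above.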


We observed a total of 88 haplotypes in our global sample. Haplotype frequencies in populations originating from different geographic regions (West Africa, East Africa, Europe, China and South America) are shown in Table 4 and Figure 2, and haplotype diversity is indicated in Table 1. We refer to haplotypes that code for the arginine amino acid and have the ancestral allele 2043G (as determined by comparison to the chimpanzee sequence) as Arg haplotypes, and those haplotypes with the 2043A allele coding for the glutamine amino acid as Gln haplotypes. The Gln haplotypes are rare in African samples (14%), more common in Eurasian samples (31%) and very common in South American samples (74%, Table 4). The Arg haplotype H49 is the most common in our combined sample and is present in all the geographic regions at frequencies higher than 5%. H49 is the modal haplotype in the East African, European and Chinese samples. Gln haplotype H73 is the only common haplotype shared by all non-African samples but is absent in our African sample. Gln haplotypes H72 and H77, which diverge from the H73 haplotype by 1 and 2 substitutions, respectively, are also common in non-African populations. Moreover, the West African population has the highest frequency of rare haplotypes, followed by East African, Chinese, European and South American populations.

Figure 2.

Haplotype frequencies in different geographic regions. Haplotypes present only once or twice in the samples are pooled and classified as 'Rare'. Haplotypes from H1 to H59 contain the ancestral Arg allele while haplotypes from H60 to H88 contain the derived Gln allele.


Population structure at IL13

Genetic distances among the 11 populations, measured using FST, are shown in Table 5. The Nonmetric Multidimensional Scaling (NM-MDS) representation of the genetic distance matrix, as well as the results of the Analysis of Molecular Variance32 (AMOVA), are shown in Figure 3. It is possible to recognize a pattern of geographic structure of genetic diversity at IL13 that has statistical support, with populations sorting by geographic region. The AMOVA analysis shows a high level of differentiation among the geographic groups (West Africa, East Africa, Europe, China and South America, FCT=0.21, P<0.001) and low level of differentiation among populations within the groups (FSC=0.03, P<0.05). As observed in other studies,19 the East African populations are situated closer to the Eurasian populations in the MDS plot compared to West African populations, and the Chinese population has an intermediate position between Amerindian and European populations (Figure 3). The South Amerindian populations appear to be outliers, likely due to high levels of genetic drift in these small and isolated populations. The among-population component of genetic variance (FST=0.24) is relatively high, nearly twice the mean global FST value for human populations (0.10–0.12; ref. 33).

Figure 3.

Multidimensional scaling representation of genetic distances. FST genetic-distance matrix between populations obtained by multidimensional scaling and apportionment of the genetic variance of IL13 obtained by AMOVA. The groups considered for the AMOVA are within the ellipses.


To understand the contribution of individual SNPs to the observed high FCT value, we calculated the global FCT (among-groups component of genetic variance) for each individual SNP. Figure 4 shows that FCT values for SNPs are not evenly distributed along the gene. SNP -646A>G upstream of the 5'UTR and the nonsynonymous SNP 2043G>A (Arg130Gln) in exon 4, as well as SNPs 2524G>A, 2676G>C, 2705C>T and 2748T>C located in the 3'UTR, show the largest FCT values, with values ranging from 0.27 to 0.37. When South Amerindians, the most differentiated group, were excluded, the FCT values for the entire gene were lower (global FCT=0.14), but the pattern across the gene persisted, with the highest FCT values corresponding to the same SNPs: -646A>G, 2043G>A (Arg130Gln), 2524G>A, 2676G>C, 2705C>T and 2748T>C as in the case of the worldwide sample.
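
The idea of a per-SNP differentiation measure can be illustrated with a simplified calculation. The sketch below is not the AMOVA-based FCT used in the study; it swaps in Wright's heterozygosity-based FST for a single biallelic site, with equal weighting of groups and no sample-size correction. The input frequencies are the pooled Gln haplotype frequencies quoted earlier (Africa 14%, Eurasia 31%, South America 74%), which use a coarser grouping than the five groups of the AMOVA, so the result is not expected to reproduce the published value.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2 * p * (1 - p)


def wright_fst(group_frequencies):
    """Wright's FST = (HT - HS) / HT for one biallelic site.

    HT is the heterozygosity at the mean allele frequency; HS is the mean of
    the within-group heterozygosities. Groups are weighted equally.
    """
    n_groups = len(group_frequencies)
    mean_p = sum(group_frequencies) / n_groups
    h_total = heterozygosity(mean_p)
    if h_total == 0:
        return 0.0  # Site is fixed everywhere, so there is no differentiation.
    h_within = sum(heterozygosity(p) for p in group_frequencies) / n_groups
    return (h_total - h_within) / h_total


# Pooled Gln130 frequencies from the text (coarser grouping than the AMOVA).
gln_fst = wright_fst([0.14, 0.31, 0.74])
print(round(gln_fst, 3))
```

Running the same function over every SNP and plotting the values by position would give a profile comparable in spirit to Figure 4.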

Figure 4.

Global FCT values (genetic variance among groups) calculated for each SNP across the IL13 gene. The nonsynonymous substitution Arg130Gln is denoted in black.

Patterns of intragenic linkage disequilibrium

The large number of haplotypes observed for the 31 SNPs suggests that recombination has shaped the pattern of haplotype diversity at IL13. The inferred minimum number of recombination events (Rm) indicates the presence of recombinant haplotypes in all the continental samples (Table 1). Moreover, African populations have higher values of the recombination parameter rho=4Ner than non-African populations, which is consistent with the higher theta=4Nemu values observed in African populations (Table 1).
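
The minimum number of recombination events (Rm) rests on the four-gamete test: if all four allele combinations are observed at a pair of biallelic sites, at least one recombination event must have occurred between them under an infinite-sites model. A minimal sketch of this calculation, in the manner of Hudson and Kaplan, is given below. It assumes phased biallelic haplotypes coded as strings of 0 and 1, and the toy data are hypothetical.

```python
from itertools import combinations


def four_gamete_intervals(haplotypes):
    """Site pairs (i, j) at which all four two-site gametes are present."""
    n_sites = len(haplotypes[0])
    intervals = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            intervals.append((i, j))
    return intervals


def minimum_recombination_events(haplotypes):
    """Rm: the largest number of non-overlapping four-gamete intervals.

    Intervals are scanned in order of their right end point and an interval
    is counted whenever it starts at or after the end of the last one counted.
    """
    intervals = sorted(four_gamete_intervals(haplotypes), key=lambda iv: iv[1])
    count = 0
    last_right = -1
    for left, right in intervals:
        if left >= last_right:
            count += 1
            last_right = right
    return count


# Hypothetical haplotypes: every pair of sites shows all four gametes.
rm_toy = minimum_recombination_events(["000", "011", "101", "110"])
print(rm_toy)
```

Rm is a lower bound: recombination events that leave no four-gamete signature are not counted, so the true number of events is typically larger.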

The results of LD analysis are shown in Figure 5. The most striking pattern observed is that African populations show lower LD (ie fewer significant pairwise comparisons and lower values of R2 and ZnS, the average of R2 over all pairwise comparisons) than non-African populations. This result is consistent with a number of other studies of LD across geographic regions.19,34,35,36,37 Additionally, the West African and East African populations show divergent patterns of LD (ie different sites are in LD). The Chinese population shows the highest mean level of LD (ZnS=0.27), but the South American population shows the highest proportion of significant values of LD. In non-African populations, the nonsynonymous SNP Arg130Gln (2043G>A) shows high and significant association with SNPs 2524G>A, 2579A>C and 2748T>C in the 3'UTR of the gene. Moreover, Chinese and South American populations show significant LD between the SNP -1111T>C in the promoter region and SNP 1922T>C in intron 3, 2043G>A in exon 4, and 2524G>A and 2579A>C in the 3'UTR, separated by more than 3000 bp, whereas the European sample does not show high levels of LD between the 5' and 3' region of the gene.
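
The two LD summaries used here can be sketched as follows: R2 is the squared correlation between alleles at two sites, and ZnS is its average over all pairs of polymorphic sites. The code assumes phased biallelic haplotypes coded as strings of 0 and 1. The function names and toy data are hypothetical, and significance testing (for example, with Bonferroni correction) is not included.

```python
from itertools import combinations


def allele_frequency(haplotypes, site):
    """Frequency of the allele coded '1' at one site."""
    return sum(h[site] == "1" for h in haplotypes) / len(haplotypes)


def r_squared(haplotypes, i, j):
    """Squared allelic correlation between two polymorphic biallelic sites."""
    n = len(haplotypes)
    p_i = allele_frequency(haplotypes, i)
    p_j = allele_frequency(haplotypes, j)
    p_ij = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ij - p_i * p_j  # Classical disequilibrium coefficient D.
    return d * d / (p_i * (1 - p_i) * p_j * (1 - p_j))


def zns(haplotypes):
    """ZnS: mean R2 over all pairs of polymorphic sites."""
    n_sites = len(haplotypes[0])
    polymorphic = [
        s for s in range(n_sites) if 0 < allele_frequency(haplotypes, s) < 1
    ]
    values = [r_squared(haplotypes, i, j)
              for i, j in combinations(polymorphic, 2)]
    return sum(values) / len(values)


# Hypothetical sample: sites 0 and 1 fully linked, site 2 independent of both.
toy_haplotypes = ["000", "001", "110", "111"]
print(r_squared(toy_haplotypes, 0, 1), zns(toy_haplotypes))
```

Monomorphic sites are skipped in `zns` because R2 is undefined when either site has no variation; at least two polymorphic sites are required.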

Figure 5.

Pairwise linkage disequilibrium in West Africa, East Africa, China, Europe and South America. If R2>0.66, squares are represented in dark gray. If 0.33 ≤ R2 ≤ 0.66, squares are in light gray. R2<0.33 are in white squares. *P<0.05, **significant values after Bonferroni correction. ZnS is the average of R2 values.

We identified haploblocks and htSNPs at the IL13 gene in each geographic region (Figure 1). In all cases, the Zhang and Li38 algorithm showed that more than one haploblock could be identified at the IL13 gene (four in West Africa, three in East Africa, two in Europe and China, and three in South America). Moreover, most of the haploblock boundaries are not shared across the five geographic regions. These results, together with the analysis of LD, indicate that haplotype structure and LD are different across the continental groups. Our analysis shows that 3–4 htSNPs would be sufficient to describe haplotype structure in the non-African populations, whereas 6–7 htSNPs would be necessary to characterize haplotype structure in the African populations (Figure 1).
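
The notion of an htSNP set can be illustrated with a deliberately simple stand-in. The sketch below does not implement the Zhang and Li algorithm or HaploBlockFinder; it merely searches exhaustively for the smallest subset of sites that still distinguishes every distinct haplotype within one block, which is feasible only for a small number of SNPs. Haplotypes are assumed to be phased and coded as strings of 0 and 1, and the toy data are hypothetical.

```python
from itertools import combinations


def smallest_tag_set(haplotypes):
    """Smallest tuple of site indices that separates all distinct haplotypes.

    Subsets are tried in order of increasing size, so the first subset that
    keeps every distinct haplotype distinguishable is a minimum-size tag set.
    """
    distinct = sorted(set(haplotypes))
    n_sites = len(distinct[0])
    for size in range(n_sites + 1):
        for sites in combinations(range(n_sites), size):
            projected = {tuple(h[s] for s in sites) for h in distinct}
            if len(projected) == len(distinct):
                return sites
    return tuple(range(n_sites))  # Not reached: all sites always suffice.


# Hypothetical block: sites 0/1 are redundant, as are sites 2/3.
tags = smallest_tag_set(["0000", "1100", "1111", "0000"])
print(tags)
```

Published block-partitioning methods usually also require that tags capture only common haplotypes above a frequency threshold, which is one reason populations with many rare haplotypes need more htSNPs under a naive criterion like this one.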

Tests of neutrality

We used several tests of selection based on comparison of the ratio of polymorphisms segregating within humans compared to fixed differences between chimpanzee and human sequences. Comparison of the ratio of polymorphisms to fixed differences at IL13 with a 10 kb noncoding region of chromosome 22 (ref. 39) and intron 44 of the DMD gene40 (both assumed to be evolving under neutrality), using an HKA test, showed no significant difference (P>0.13), indicating that the IL13 gene does not show an excess or depletion of variation compared to these other loci. Additionally, sliding window analysis of differences in the ratio of polymorphism/fixed differences across the gene using several statistics (KR, DKS and Gmean31) indicates no evidence for significant heterogeneity in the pattern of genetic diversity across the IL13 gene (Table 3), as would be expected if selection had acted on particular variants in exons or regulatory regions. Comparison of the ratio of synonymous to nonsynonymous polymorphisms for sites segregating within humans and those fixed between humans and chimpanzee using the McDonald–Kreitman41 (MK) test was also not significant (P=1.00).
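
The MK test reduces to a test of independence on a 2 × 2 table (polymorphic versus fixed, nonsynonymous versus synonymous). The sketch below evaluates such a table with a two-sided Fisher's exact test written in plain Python. The counts in the example are hypothetical placeholders, since the actual MK table is not given in this section; only the single nonsynonymous polymorphism is taken from the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)  # Tolerance for ties.
    )


# Hypothetical MK table. Rows: polymorphic, fixed.
# Columns: nonsynonymous, synonymous.
p_value = fisher_exact_two_sided(1, 2, 1, 2)
print(p_value)
```

With so few coding changes available in a gene of this size, an MK test has little power, so a nonsignificant result is weak evidence against selection on its own.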

The observed high level of recombination prevents the construction of an accurate phylogeny of the IL13 gene and the use of population genetics methods that rely on the absence of recombination. However, population genetics methods that are based on the frequency spectrum of SNPs (ie Tajima's D, Fu and Li's D* and F*, and Fay and Wu's H statistics) can be used to make inferences about how selection or demographic forces have shaped patterns of genetic variation at this gene. We used these statistics and coalescent simulations to test neutrality at IL13 under different sets of null hypotheses that include (1) constant population size with different levels of recombination and (2) demographic models of exponential growth at different rates. The results of neutrality tests based on the allelic spectrum are shown in Table 1. Fu and Li's D* and F* statistics do not reject the mutation-drift equilibrium (MDE) model for any of the continental samples, even assuming moderate levels of recombination (rho=0.005 between adjacent sites). Tajima's D statistic differs significantly from MDE expectations only for the South American population, when we assume values of the recombination parameter (rho) higher than 0.002 (less than the estimated recombination parameter for that population) and models of at least 100-fold population expansion occurring more than 800 generations ago (a low estimate of growth for human populations) (Table 1). The H statistic, developed by Fay and Wu,28 was designed to detect an excess of high-frequency derived alleles, which is expected from a hitchhiking effect under positive directional selection. Under models of population growth alone, none of the H values were significant for any population. However, the values of H for the East African (H=-4.30) and European (H=-4.70) populations are significant for recombination parameters higher than 0.010 and 0.001, respectively, under a constant-population size model.
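
Two of the frequency-spectrum statistics named above can be computed directly from the number of derived alleles observed at each segregating site. The sketch below uses the standard formula for Tajima's D and the unnormalized form of Fay and Wu's H (pi minus thetaH). It assumes the ancestral state of each site is known (for example, from the chimpanzee sequence) and that every count lies strictly between 0 and n. The function names and toy spectra are hypothetical, and significance would still have to be assessed by coalescent simulation, which is not shown.

```python
from math import sqrt


def pairwise_diversity(derived_counts, n):
    """pi (per locus): mean pairwise differences from derived-allele counts."""
    return sum(2 * k * (n - k) for k in derived_counts) / (n * (n - 1))


def tajimas_d(derived_counts, n):
    """Tajima's D: scaled difference between pi and S / a1."""
    s = len(derived_counts)  # Number of segregating sites.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    pi = pairwise_diversity(derived_counts, n)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def fay_wu_h(derived_counts, n):
    """Fay and Wu's H = pi - thetaH; negative with many high-frequency derived alleles."""
    theta_h = sum(2 * k * k for k in derived_counts) / (n * (n - 1))
    return pairwise_diversity(derived_counts, n) - theta_h


# Hypothetical spectra for n = 10 chromosomes and five segregating sites.
singletons = [1, 1, 1, 1, 1]      # Excess of rare variants.
high_derived = [9, 9, 9, 9, 9]    # Excess of high-frequency derived variants.
print(tajimas_d(singletons, 10), fay_wu_h(high_derived, 10))
```

The two toy spectra have the same pi and therefore the same Tajima's D, but very different H values, which illustrates why H requires polarized (ancestral versus derived) data and captures a signal that D cannot.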

Przeworski42 has shown that the H statistic is sensitive to population structure. Therefore, we also tested the significance of the H values against null hypotheses that incorporate different levels of population structure (see Materials and methods for details). Our simulations show that population structure skews the distribution of H towards negative values (although the mean value of H remains nearly the same), and this effect is more accentuated as FST increases (ie number of migrants decreases), in agreement with Przeworski.42 The observed FST for the two European groups considered in this study (North European and Russian) is 0.07 (although not significantly different from zero) and is 0.024 for the two East African groups (Hadza and Maasai). When we consider an FST value as low as 0.01, the 95% confidence interval for H in the Europeans was (-5.90, 3.57), and for East Africans was (-6.23, 4.22) for the observed level of recombination in these populations (Table 1). The observed H values of -4.73 in Europe and -4.30 in East Africa are within these confidence intervals and, therefore, when we consider even low levels of population substructure, we cannot reject the null hypothesis of neutrality.

Discussion

We have characterized nucleotide diversity in both coding and noncoding regions of the IL13 gene and its promoter in African, European, Asian and Amerindian populations. Knowledge of the patterns of genetic diversity and haplotype structure at IL13 across ethnically diverse populations has important implications for identification of SNPs and haplotypes useful for gene-mapping studies of complex diseases such as asthma and allergies, as well as for studies of genetic susceptibility to helminths and other extracellular parasites, which are influenced by IL-13 expression. Additionally, resequencing analysis of the gene in randomly selected individuals from geographically diverse populations enables us to reconstruct the evolutionary history of the IL13 gene and to test for a genetic signature of historical selection.

Divergent patterns of variation in Africans and non-Africans

Patterns of genetic variation and LD in modern populations are affected by both gene-specific factors, such as mutation and recombination rates, gene conversion and selection, and demographic factors, including population contractions and expansions and population subdivision.14 Comparisons across geographically diverse populations have shown that patterns of genetic diversity at IL13 are different in African and non-African populations, as expected given their distinct demographic histories.19 As observed for other genomic regions,19,43 African populations have higher levels of genetic diversity (theta=4Nemu) and recombination (rho=4Ner) at IL13 and lower levels of LD than non-African populations. These results imply a larger effective population size (Ne) and a higher number of mutation and recombination events in ancestral African populations. Non-Africans have a subset of the haplotype diversity present in Africa, as observed at several other loci.34,35,36 Levels of LD may be higher and diversity levels lower in non-Africans due to a bottleneck event during the migration of modern humans out of Africa within the past 100 000 years.34,35,36,43

Implications for epidemiological studies

Our analyses confirm that the only nonsynonymous substitution observed worldwide in the IL13 gene is the Arg130Gln SNP at position 2043 in exon 4, which is present in all the continental groups. The frequency of the Arg130Gln SNP differs among continental regions (FCT=0.27, Figure 4). The derived Gln130 allele has consistently been shown to be associated with asthma-related disorders and proximal phenotypes (IL-13 and IgE serum levels) in European and Japanese samples,6,44 as well as with sowda (a hyper-reactive onchocerciasis) in populations from Ghana and Guinea.13 Moreover, molecular modeling suggests that the Arg130Gln amino-acid substitution is functionally important because it affects the binding site of IL-13 with its receptor.6,45 Thus, the relatively high frequency of the Gln130 amino acid in non-Africans and the low frequency of this variant in Africans could have implications for differential susceptibility to disease in populations originating from these regions.

The common -1111T allele in the promoter region has also been associated with asthma, high levels of serum IL-13 and IgE, as well as asthma-proximal phenotypes (bronchial hyper-responsiveness and skin-test responsiveness) in case–control studies with European populations.46,8 The -1111T>C SNP is located in a region containing a binding site of the nuclear factor of activated T cells (NFAT) transcription factor that regulates IL13 and IL4 gene expression. The T allele at position -1111 results in increased binding of the NFAT protein to this region.46 These studies suggest that the -1111T>C SNP is functionally important and could account for the observed association of the -1111T allele and asthma-related phenotypes. Alternatively, the statistical association of the -1111T allele with asthma and related phenotypes could be due to LD with a different causative mutation. In the present study, we have observed significant but low levels of LD (R2<0.66, Figure 5) between the -1111T allele and the 2043A allele (Gln130) in Chinese and South American populations, but no significant LD in the European and African populations. The lack of LD between SNPs -1111T>C and 2043G>A (Arg130Gln) in our European sample is consistent with results of a study in a Dutch population,8 which demonstrated that the association between SNP -1111T>C and asthma-related disorders is not attributable to the 2043G>A SNP (Arg130Gln) and suggests a direct role of the -1111T allele in the promoter region on asthma-related phenotypes in that population. Consistent with the low level, or absence, of LD between the -1111T>C and 2043G>A SNPs across different populations, these SNPs have been identified as htSNPs belonging to different haploblocks (Figure 1) and, thus, both should be included in disease association studies.
Moreover, other SNPs in the promoter region, introns, or 3'UTR could also be functionally relevant.47 For example, we observed a high FCT value for the -646A>G SNP upstream of the 5'UTR (an htSNP in the West African population), in which the derived G allele is common only in West Africa (40% frequency) but is very rare or absent elsewhere. Thus, this SNP is a good candidate for future functional and association studies in West African and African American populations.

The divergent pattern of LD (and haploblock structure) across populations in different geographic regions indicates that the haploblock structure of IL13 is sensitive to the demographic history of populations. This is evident even within Africa, where West and East African populations show different sites in LD and different haploblock patterns. This result is consistent with previous studies showing divergent patterns of haplotype variability and LD among sub-Saharan African populations,34,35,36,43,19 likely due to higher levels of population subdivision in Africa. This result has important implications for the HapMap initiative,15 which is based on the assumption that it will be possible to identify a common pattern of haplotype structure (and htSNPs) for use in gene mapping studies across geographic regions. Inclusion of only one or a few African populations in this project may not be adequate for capturing the divergent patterns of LD and haplotype block structure in this geographic region. Our results suggest that studies of association between haplotypes and disease require careful analysis of haplotype structure across ethnically and geographically diverse populations.

Evolutionary inferences and tests of selection

Functional and association studies have shown that the Gln allele is associated with higher levels of IL-13 and a stronger Th2 immune response than the Arg allele.45 Having a strong Th2 immune response could be selectively advantageous or disadvantageous, depending on environmental conditions; a hyperactive Th2 immune response could be advantageous for ridding the body of parasites, but could be disadvantageous by increasing susceptibility to autoimmune disorders, or generating tissue damage in the host.13 In order to distinguish a genetic signature of natural selection and to identify SNPs which are of potential functional significance, we have applied several statistical tests to infer if the pattern of genetic diversity at IL13 is consistent with a neutral model. The region encompassing intron 3, exon 4 and the 3'UTR, which contains the only nonsynonymous SNP at position 2043 (Arg130Gln), shows the highest level of diversity in all five populations studied (Table 2), which contrasts with the lack of diversity observed in the adjacent region encompassing exon 2, intron 2 and exon 3. To test if natural selection has acted on some regions of the IL13 gene, we used the HKA and the McDonald sliding window tests to compare the ratio of polymorphism to fixed differences between humans and chimpanzees across the IL13 gene. However, as expected under neutrality, we did not obtain significant evidence of departure from homogeneity of the ratio of polymorphisms to fixed differences across the gene.

Signatures of natural selection can also be detected by analysis of FST or FCT values for different SNPs. Individual SNPs that are targets of selection, together with flanking SNPs in tight LD, may show skewed values of FST or FCT48,49 in comparison with FST or FCT values of SNPs which are influenced solely by genetic drift and gene flow. One would expect to observe higher FST or FCT values compared to neutral SNPs if directional or balancing selection results in different alleles maintained at high frequency in different populations. The analysis of genetic structure across geographic regions for specific SNPs shows that FCT values are not evenly distributed across the IL13 gene. The Arg130Gln SNP (2043G>A) in exon 4 and SNPs 2524G>A, 2676G>C, 2705C>T and 2748T>C in the 3'UTR show FCT values of 0.25–0.37 (Figure 4), which are considerably higher than the average FCT value for individual SNPs at the IL13 gene (0.08), and higher than typical values of FST observed for other loci (0.10–0.12; ref. 33). These SNPs are in LD with each other in non-African populations (Figure 5), which could explain why they share high FCT values. Under the action of genetic drift and gene flow and in the absence of selection, FCT values are expected to be similar across the gene. The fact that the highest values of FCT are concentrated around the only nonsynonymous substitution (Arg130Gln, 2043G>A) in exon 4 suggests that natural selection may have been responsible for the observed large differences in frequencies of the Gln and Arg haplotypes across continental groups. Likewise, the -646A>G SNP in the upstream region of the gene shows an FCT=0.29, which is considerably higher than the rest of the IL13 gene and other loci and, thus, this SNP may also be a target of selection.

In order to distinguish whether selection has altered the pattern of nucleotide variation at the IL13 gene, we applied several neutrality tests based on the allelic frequency spectrum including Tajima's D, Fu and Li's D* and F* and Fay and Wu's H statistics (Table 1). The statistics D, D* and F* do not differ from expectations under neutrality in most of the samples. Only the South American sample shows a Tajima's D statistic significantly higher than expected under models with recombination (rho>0.002) and exponential population growth (Table 1; see Materials and methods for details). Sakagami et al11 also found positive and significant values of D in Japanese populations, and Fu and Li's F* and D* in European and Japanese populations under models of instantaneous population expansion. While positive values of the D statistic are consistent with models of balancing selection, recent historical bottlenecks and population structure or admixture can also produce positive D values.50 In the case of the South Amerindian sample, there is evidence of admixture with European populations as well as a recent population bottleneck.51 Thus, we cannot easily distinguish between balancing selection or demographic history as an explanation for the observed positive D value.

In contrast to the D, F* and D* tests, the East African and European populations show negative and significant values of the H statistic under models of recombination with parameters similar to those estimated for these populations (Table 1). This result suggests an excess of common derived alleles in these populations, possibly due to selection. Our observation of a significant H value in Europeans is consistent with a study by Zhou et al,10 who observed a significant H value in Europeans and Chinese. Although we do not know the specific selective forces that might be acting on the IL13 gene, examples of geographically circumscribed patterns of natural selection are well known, particularly in regions exposed to distinct environments and infectious diseases.16,52,53 However, our simulations have also demonstrated that even very small levels of population substructure (FST=0.01) can result in a negative skew in the distribution of Fay and Wu's H values, such that we can no longer reject neutrality. Therefore, population structure could be an alternative explanation for the observed significant H values in the current study and in the study of Zhou et al,10 which is consistent with results of simulations by Wakeley and Aliacar,54 and Przeworski42 showing the effects of migration on Fay and Wu's H statistic. Our results suggest that one must consider demographic history, including population expansion and substructure, when testing for signatures of natural selection.

In conclusion, we have determined the haplotype structure of the IL13 gene in ethnically and geographically diverse populations. IL13 shows a pattern of genetic variability characterized by high levels of nucleotide diversity and divergent patterns of haplotype structure and LD across ethnic groups. Thus, haplotype structure should be determined in each ethnic group used for gene mapping studies by association at this locus. While the high FCT observed at particular sites at IL13 is consistent with selection, several tests of neutrality give conflicting results. We cannot reject neutrality based on the HKA, MK tests and several tests based on the allelic distribution (D, D* and F*) in most populations. We can reject a model of neutrality based on Tajima's D in South Amerindians and based on Fay and Wu's H test in Europeans and East Africans, but not under certain models of population structure. However, these tests often lack power, particularly for detecting recent selection events55 and additional data extending across a larger region and the development of more powerful tests of selection may be necessary to distinguish historic selection at this locus.

Materials and methods

Populations sampled

We sequenced 3950 bp encompassing the promoter and coding region of IL13 in 132 individuals belonging to 11 worldwide populations from West Africa (14 Igbo, 14 Yoruba and seven Fulani from Nigeria collected by SAT), East Africa (17 Hadza, 10 Maasai from Tanzania collected by SAT), Europe (10 North Europeans and 10 Russians from the Coriell cell repository), Asia (10 Chinese from the Coriell cell repository) and South America (nine Cayapa from Ecuador, 17 Peruvian Quechua from Tayacaja and 13 from San Martin de Pangoa51,56). All samples were collected with informed consent of the donors and the study was performed under Institutional Review Board approval. We also sequenced the IL13 gene in one chimpanzee from the Coriell cell repository to be used as an outgroup in statistical analyses and to identify ancestral and derived states of polymorphisms.

Sequence analysis

We first amplified a 5.8-kb fragment from genomic DNA by long-range PCR, using Herculase DNA Polymerase (Stratagene). From this PCR product we used nested primers to amplify three overlapping fragments of 1655 bp (IL13PR), 2164 bp (IL13.1) and 1893 bp (IL13.2). These fragments encompass 3950 bp, including the entire IL13 gene and a 1500 bp fragment upstream of the start codon. PCR products were sequenced using nine primers. Sequences and PCR conditions for the primers are available at the Tishkoff laboratory webpage (http://www.life.umd.edu/biology/tishkofflab/). Sequencing reactions were performed using BigDye terminator chemistry (BigDye RR Mix kit from Applied Biosystems). The Phred-Phrap-Consed software57,58 was used to call bases, assemble and edit the sequences, and Polyphred software59 was used to detect polymorphisms and heterozygous sites, which were confirmed by visual inspection. All singletons were confirmed by re-sequencing a new PCR-amplicon obtained from genomic DNA.

Haplotype determination

Haplotypes were inferred by statistical methods in combination with molecular techniques. First, we inferred haplotypes from the multilocus genotypes by the method developed by Stephens et al,60 using the software PHASE 1.0. This method has the following features: (1) it estimates a probability that each phase call is correct, and (2) it enables inferences to be improved by using partial information about chromosomal phase, available from molecular haplotyping. After an initial round of statistical haplotype inferences using PHASE, 20 individuals carrying sites with the lowest probability of correct haplotype call (lower than 0.64) were selected for molecular haplotyping by cloning followed by sequencing of a single clone, which allows determination of the chromosomal phase by comparison with the diploid sequence. The information about these observed haplotypes was then incorporated into the PHASE analysis and an additional round of haplotype inference was performed. Out of the 88 haplotypes identified, 34 were unambiguously phased either because they were observed in a homozygous state, were heterozygous at a single site (18 haplotypes) or were resolved experimentally by molecular haplotyping (16 full haplotypes and four partial haplotypes: H3, H16, H22, H48, Table 4). After statistical and molecular haplotyping, 83% of haplotypes show phase callings with confidence values higher than 0.95.

Estimates of genetic diversity

We computed two different estimators of the parameter theta = 4·Ne·mu, where Ne is the effective population size and mu is the mutation rate per site per generation. These estimators were pi, which is the per-site mean number of pairwise differences between sequences,22 and theta_S,23 which is based on the number of observed segregating sites (S). Calculations of nucleotide diversity were performed using the software DnaSP 3.98.61 To estimate the mutation rate (mu) per site per generation at the IL13 locus, we divided the number of fixed differences between humans and chimpanzee by twice the divergence time between these species (5–7 Myr) expressed in number of generations (20 and 25 years per generation).62
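
As an illustration of the two estimators and of the mutation-rate calculation described above, the following is a minimal Python sketch. It is not the DnaSP implementation; the function names, the toy haplotypes and the counts in the usage lines are hypothetical, and the sketch assumes aligned haplotypes of equal length with no missing data.

```python
from itertools import combinations


def pairwise_diversity(haplotypes):
    """pi: per-site mean number of pairwise differences between sequences."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


def watterson_theta(haplotypes):
    """theta_S: per-site estimator based on the number of segregating sites S."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    # A site is segregating if more than one allele is observed in the sample.
    s = sum(len(set(column)) > 1 for column in zip(*haplotypes))
    a1 = sum(1 / i for i in range(1, n))
    return s / a1 / length


def mutation_rate(fixed_differences, length, divergence_years, years_per_generation):
    """mu per site per generation from human-chimpanzee fixed differences."""
    generations = divergence_years / years_per_generation
    # Divergence accumulates along both lineages, hence the factor of two.
    return fixed_differences / length / (2 * generations)


# Hypothetical toy data, for illustration only.
toy = ["AAGT", "AAGC", "ATGC", "ATGT"]
print(pairwise_diversity(toy), watterson_theta(toy))
print(mutation_rate(6, 1000, 6e6, 20))
```

Under neutrality and constant population size the two estimators are expected to be similar, which is the basis of the frequency-spectrum tests described under Tests of neutrality.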

Analysis of population structure

Genetic distances (FST) between populations were calculated assuming the Tamura–Nei model of nucleotide substitution.63 The genetic distance matrix was graphically summarized by NM-MDS,64 using the software STATISTICA. We used the AMOVA32 to assess the apportionment of genetic variance between the following continental groups of populations: West Africa (Fulani, Igbo and Yoruba), East Africa (Hadza and Maasai), China, Europe (Russia and Northern Europe) and South America (Cayapa and Quechuas from Tayacaja and San Martín de Pangoa). Genetic distances and AMOVA analyses were calculated using the Arlequin 2.0 software.65

Estimates of recombination parameters and LD

The recombination parameter between adjacent sites, rho = 4·Ne·r (r is the recombination rate between adjacent sites per generation), was estimated by the method of Li and Stephens24 using the software PHASE 2.0.2, and the minimum number of recombination events (Rm) was estimated using the method of Hudson and Kaplan.25 LD was measured by R2 (ref. 66) and its significance was assessed by Fisher exact test, applying the Bonferroni correction for multiple comparisons, using the software DnaSP 3.8. We assessed the overall level of LD by calculating the average of R2 over all pairwise comparisons: ZnS.67 Our analyses include only SNPs with minor allele frequencies higher than 4% that have enough power to detect LD. We also used the approach developed by Zhang and Jin38 to identify blocks of LD (haploblocks) within the IL13 gene and htSNPs that uniquely distinguish common haplotypes. Haploblocks were defined on the basis of LD, imposing an LD threshold of |D'|>0.8.68 Haploblocks and htSNPs were identified using the software HaploBlockFinder, available at http://cgi.uc.edu/cgi-bin/kzhang/haploBlockFinder.cgi.
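
The pairwise LD measure and its average over all pairs of sites (ZnS) can be sketched as follows. This is a simplified illustration under stated assumptions, not the DnaSP code: it assumes phased haplotypes with no missing data, keeps only biallelic sites, and applies the 4% minor allele frequency filter mentioned above through a hypothetical `min_maf` parameter. Function names are illustrative.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """R2 between two biallelic sites, each given as a sequence of alleles
    ordered by haplotype."""
    n = len(site_a)
    a_ref, b_ref = site_a[0], site_b[0]
    p_a = sum(x == a_ref for x in site_a) / n
    p_b = sum(y == b_ref for y in site_b) / n
    p_ab = sum(x == a_ref and y == b_ref for x, y in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # classical disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def zns(haplotypes, min_maf=0.04):
    """Average R2 over all pairs of biallelic sites passing the MAF filter."""
    kept = []
    for column in zip(*haplotypes):
        if len(set(column)) != 2:
            continue  # skip monomorphic and multiallelic sites
        freq = column.count(column[0]) / len(column)
        if min(freq, 1 - freq) > min_maf:
            kept.append(column)
    pairs = list(combinations(kept, 2))
    if not pairs:
        raise ValueError("need at least two sites passing the filters")
    return sum(r_squared(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy haplotypes: sites 1 and 2 are in complete LD,
# site 3 is independent of both.
print(zns(["ACA", "ACG", "GTA", "GTG"]))
```

Significance testing (Fisher exact test with Bonferroni correction) and haploblock definition by |D'| are not covered by this sketch.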

Tests of neutrality

We used several statistical tests to assess departure from the null hypothesis of MDE (ie neutrality) under the infinite-sites model. Under neutrality, the ratio of polymorphism to fixed differences between humans and chimpanzee should be homogeneous across different genomic regions. We used (1) the McDonald–Kreitman41 test to assess if this ratio is homogeneous between synonymous and nonsynonymous substitutions; (2) the HKA test30 to assess if the ratio of polymorphism to fixed differences is homogeneous for pairwise comparisons among the four exons of the IL13 gene and across genes and (3) a sliding window approach described by McDonald31 to test for homogeneity of the ratio of polymorphism to fixed differences across the IL13 gene using the following statistics: (a) the number of runs (KR), (b) the Kolmogorov–Smirnov statistic (DKS), and (c) the mean sliding G statistic (Gmean), based on the classical G statistic (ie the log likelihood ratio statistic calculated for a 2 × 2 contingency table). The expectation under a neutral model is that regions with high levels of fixed differences between humans and chimpanzee have less functional constraint and/or high levels of mutation, and are expected to have high levels of SNP diversity among human populations. By comparing the ratio of polymorphism to fixed differences, we can account for differences among regions due to either differences in mutation rate or functional constraint (HKA and sliding windows test). The MK test compares the ratio of synonymous and nonsynonymous substitutions segregating within humans and those fixed between humans and chimpanzee. This test indicates whether or not there is an excess or deficiency of nonsynonymous polymorphism segregating in the population.
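
The classical G statistic for a 2 × 2 table mentioned above can be written as G = 2 Σ O ln(O/E), where O and E are observed and expected cell counts. The sketch below is a generic illustration of that formula, not McDonald's sliding-window program; the function name and the counts are hypothetical. For a McDonald–Kreitman-style table the rows would be polymorphic versus fixed sites and the columns synonymous versus nonsynonymous changes.

```python
import math


def g_statistic(table):
    """Log likelihood ratio (G) statistic for a 2 x 2 contingency table.

    `table` is ((a, b), (c, d)) of observed counts. Expected counts are
    derived from the row and column totals under homogeneity.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return 2 * g


# Hypothetical counts: rows = polymorphic / fixed, columns = two site classes.
print(g_statistic(((10, 20), (20, 10))))
```

With one degree of freedom, G is usually compared with the chi-square critical value of 3.84 at the 5% level; with very small counts, as is common for a single short gene, an exact test may be preferable.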

We also used several statistical tests of neutrality based on the allelic frequency distribution. Tajima's D27 examines the difference between theta_S and pi. Fu and Li's D*26 and F* examine the differences between theta_eta (estimated based on the number of singletons in a sample) and theta_S or pi, respectively. Fay and Wu's H28 examines the difference pi − theta_H, where theta_H is an estimator weighted for the presence of high-frequency derived variants. The difference between these estimators of theta should be close to zero under an MDE model and deviations from the expectation under an MDE model can be due either to selection or to demographic history of populations. For example, a significant negative value for D, D* and F* indicates an excess of rare frequency alleles and is consistent with a model of positive directional selection or of population expansion. A significant positive value for D, D* and F* indicates an excess of intermediate frequency alleles and is consistent with a population bottleneck, population substructure or balancing selection. The H test is informative for detecting positive directional selection, although it is also sensitive to the effects of population substructure.42 In the case of D*, F* and H tests, we use the chimpanzee sequence as an outgroup and, thus, we include information about ancestral and derived states of polymorphisms. We tested the significance of these statistics using coalescent simulations conditioning on estimators of the parameter theta (5000 repetitions). For models of constant population size we assumed increasing rates of recombination (rho=0.000, 0.002, 0.004, ..., 0.012, where rho = 4·Ne·r is the population recombination parameter between adjacent sites) and we tested the significance of D, D*, F* and H using the software DnaSP (v3.8).
For H and D we also tested models of population expansion using the software ms.69 Specifically, we simulated models of exponential growth from 0.0001–1.0% of the current population size (approx. 6 × 10^9), that started 200, 400, 800, ..., 4000 generations ago. As pi is an estimator of theta = 4·Ne·mu that reflects the effective population size prior to the time of population expansion rather than current theta values, we conditioned simulations of X-fold population expansion based on estimators of theta equal to X·pi. As required by the algorithm used by the ms software, parameters of time (including exponential growth rates) were all measured in units of 4N0 = X·pi/mu. For H we also simulated a population structure model for which we considered two subpopulations that split T generations ago (T=100, 200, 400, 600, ..., 4000) and then exchange Nm migrants per generation (Nm=0.375, 0.575, 1, 2.25, 4.75 and 24.75). Under the island model at equilibrium, where FST is approximately 1/(4Nm + 1), these values correspond to FST values of 0.4, 0.3, 0.2, 0.1, 0.05 and 0.01, respectively.
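
To make the frequency-spectrum tests concrete, the sketch below computes Tajima's D from a set of aligned haplotypes using the standard constants from Tajima's 1989 formulation (ref. 27). It is an illustration only, not the DnaSP implementation: it assumes no missing data, treats any site with more than one allele as a single segregating site, and does not assess significance, which in this study was done by coalescent simulation. The function name and toy data are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(haplotypes):
    """Tajima's D: scaled difference between the mean number of pairwise
    differences (k) and S / a1, where S is the number of segregating sites."""
    n = len(haplotypes)
    s = sum(len(set(column)) > 1 for column in zip(*haplotypes))
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")

    n_pairs = n * (n - 1) / 2
    k = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    ) / n_pairs

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy samples: intermediate-frequency variants give D > 0,
# an excess of singletons gives D < 0.
print(tajimas_d(["AAGT", "AAGC", "ATGC", "ATGT"]))
print(tajimas_d(["AAAA", "TAAA", "ACAA", "AAGA", "AAAA", "AAAA"]))
```

The sign of D in the two toy samples matches the interpretation given above: positive when intermediate-frequency alleles are in excess and negative when rare alleles are in excess.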

References

  1. Benjamini E, Coico R & Sunshine G. Immunology. A Short Course. Wiley-Liss: New York, 2000.
  2. Cookson W. Genetics and genomics of asthma and allergic diseases. Immunol Rev 2002; 190: 195−206.
  3. Quinnell RJ. Genetics of susceptibility to human helminth infection. Int J Parasitol 2003; 33: 1219−1231.
  4. Brombacher F. The role of interleukin-13 in infectious diseases and allergy. Bioessays 2000; 22: 646−656.
  5. McKenzie AN. Regulation of T helper type 2 cell immunity by interleukin-4 and interleukin-13. Pharmacol Ther 2000; 88: 143−151.
  6. Heinzmann A, Mao XQ & Akaiwa M et al. Genetic variants of IL-13 signalling and human asthma and atopy. Hum Mol Genet 2000; 9: 549−559.
  7. Graves PE, Kabesch M & Halonen M et al. A cluster of seven tightly linked polymorphisms in the IL-13 gene is associated with total serum IgE levels in three populations of white children. J Allergy Clin Immunol 2000; 105: 506−513.
  8. Howard TD, Whittaker PA & Zaiman AL et al. Identification and association of polymorphisms in the interleukin-13 gene with asthma and atopy in a Dutch population. Am J Respir Cell Mol Biol 2001; 25: 377−384.
  9. Howard TD, Koppelman GH & Xu J et al. Gene−gene interaction in asthma: IL4RA and IL-13 in a Dutch population with asthma. Am J Hum Genet 2002; 70: 230−236.
  10. Zhou G, Zhai Y & Dong X et al. Haplotype structure and evidence for positive selection at the human IL13 locus. Mol Biol Evol 2004; 21: 29−35.
  11. Sakagami T, Witherspoon DJ & Nakajima T et al. Local adaptation and population differentiation at the interleukin 13 and interleukin 4 loci. Genes Immun 2004; 5: 389−397.
  12. Noguchi E, Nukaga-Nishio Y & Jian Z et al. Haplotypes of the 5' region of the IL-4 gene and SNPs in the intergene sequence between the IL-4 and IL-13 genes are associated with atopic asthma. Hum Immunol 2001; 62: 1251−1257.
  13. Hoerauf A, Kruse S & Brattig NW et al. The variant Arg110Gln of human IL-13 is associated with an immunologically hyper-reactive form of onchocerciasis (sowda). Microbes Infect 2002; 4: 37−42.
  14. Tishkoff SA & Verrelli BC. Role of evolutionary history on haplotype block structure in the human genome: implications for disease mapping. Curr Opin Genet Dev 2003; 13: 569−575.
  15. The International HapMap Consortium. The International HapMap Project. Nature 2003; 426: 789−796.
  16. Tishkoff SA, Varkonyi R & Cahinhinan N et al. Haplotype diversity and linkage disequilibrium at human G6PD: recent origin of alleles that confer malarial resistance. Science 2001; 293: 455−462.
  17. Smirnova I, Hamblin MT, McBride C, Beutler B & Di Rienzo A. Excess of rare amino acid polymorphisms in the Toll-like receptor 4 in humans. Genetics 2001; 158: 1657−1664.
  18. Verrelli BC, McDonald JH & Argyropoulos G. Evidence for balancing selection from nucleotide sequence analyses of human G6PD. Am J Hum Genet 2002; 71: 1112−1128.
  19. Tishkoff SA & Verrelli BC. Patterns of human genetic diversity: implications for human evolutionary history and disease. Annu Rev Genomics Hum Genet 2003; 4: 293−340.
  20. Makova KD, Ramsay M, Jenkins T & Li WH. Human DNA sequence variation in a 6.6-kb region containing the melanocortin 1 receptor promoter. Genetics 2001; 158: 1253−1268.
  21. Yu N, Zhao Z & Fu YX et al. Global patterns of human DNA sequence variation in a 10-kb region on chromosome 1. Mol Biol Evol 2001; 18: 214−222.
  22. Tajima F. Evolutionary relationship of DNA sequences in finite populations. Genetics 1983; 105: 437−460.
  23. Watterson GA. On the number of segregating sites in genetical models without recombination. Theor Popul Biol 1975; 7: 256−276.
  24. Li N & Stephens M. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 2003; 165: 2213−2233.
  25. Hudson RR & Kaplan NL. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 1985; 111: 147−164.
  26. Fu YX & Li WH. Statistical tests of neutrality of mutations. Genetics 1993; 133: 693−709.
  27. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 1989; 123: 585−595.
  28. Fay JC & Wu CI. Hitchhiking under positive Darwinian selection. Genetics 2000; 155: 1405−1413.
  29. Nei M. Molecular Evolutionary Genetics Columbia University Press: New York 1987; p 512.
  30. Hudson RR, Kreitman M & Aguade M. A test of neutral molecular evolution based on nucleotide data. Genetics 1987; 116: 153−159.
  31. McDonald JH. Improved tests for heterogeneity across a region of DNA sequence in the ratio of polymorphism to divergence. Mol Biol Evol 1998; 15: 377−384.
  32. Excoffier L, Smouse PE & Quattro JM. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 1992; 131: 479−491.
  33. Barbujani G, Magagni A, Minch E & Cavalli-Sforza LL. An apportionment of human DNA diversity. Proc Natl Acad Sci USA 1997; 94: 4516−4519.
  34. Tishkoff SA, Dietzsch E & Speed W et al. Global patterns of linkage disequilibrium at the CD4 locus and modern human origins. Science 1996; 271: 1380−1387.
  35. Tishkoff SA, Goldman A & Calafell F et al. A global haplotype analysis of the myotonic dystrophy locus: implications for the evolution of modern humans and for the origin of myotonic dystrophy mutations. Am J Hum Genet 1998; 62: 1389−1402.
  36. Tishkoff SA, Pakstis AJ & Stoneking M et al. Short tandem-repeat polymorphism/Alu haplotype variation at the PLAT locus: implications for modern human origins. Am J Hum Genet 2000; 67: 901−925.
  37. Gabriel SB, Schaffner SF & Nguyen H et al. The structure of haplotype blocks in the human genome. Science 2002; 296: 2225−2229.
  38. Zhang K & Jin L. HaploBlockFinder: haplotype block analyses. Bioinformatics 2003; 19: 1300−1301.
  39. Zhao Z, Jin L & Fu YX et al. Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22. Proc Natl Acad Sci USA 2000; 97: 11354−11358.
  40. Nachman MW & Crowell SL. Contrasting evolutionary histories of two introns of the duchenne muscular dystrophy gene, Dmd, in humans. Genetics 2000; 155: 1855−1864.
  41. McDonald JH & Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature 1991; 351: 652−654.
  42. Przeworski M. The signature of positive selection at randomly chosen loci. Genetics 2002; 160: 1179−1189.
  43. Tishkoff SA & Williams SM. Genetic analysis of African populations: human evolution and complex disease. Nat Rev Genet 2002; 3: 611−621.
  44. Liu X, Beaty TH & Deindl P et al. Associations between total serum IgE levels and the 6 potentially functional variants within the genes IL4, IL13, and IL4RA in German children: the German Multicenter Atopy Study. J Allergy Clin Immunol 2003; 112: 382−388.
  45. Arima K, Umeshita-Suyama R & Sakata Y et al. Upregulation of IL-13 concentration in vivo by the IL13 variant associated with bronchial asthma. J Allergy Clin Immunol 2002; 109: 980−987.
  46. van der Pouw Kraan TC, van Veen A & Boeije LC et al. An IL-13 promoter polymorphism associated with increased risk of allergic asthma. Genes Immun 1999; 1: 61−65.
  47. Tabor HK, Risch NJ & Myers RM. Opinion: Candidate-gene approaches for studying complex genetic traits: practical considerations. Nat Rev Genet 2002; 3: 391−397.
  48. Akey JM, Zhang G, Zhang K, Jin L & Shriver MD. Interrogating a high-density SNP map for signatures of natural selection. Genome Res 2002; 12: 1805−1814.
  49. Rockman MV, Hahn MW, Soranzo N, Goldstein DB & Wray GA. Positive selection on a human-specific transcription factor binding site regulating IL4 expression. Curr Biol 2003; 13: 2118−2123.
  50. Fu YX. New statistical tests of neutrality for DNA samples from a population. Genetics 1996; 143: 557−570.
  51. Tarazona-Santos E, Carvalho-Silva DR & Pettener D et al. Human evolution in South America is related with environmental and cultural diversity: evidence from Y chromosome. Am J Hum Genet 2001; 68: 1485−1496.
  52. Haldane JBS. Disease and evolution. Ric Sci 1949; 19 (Suppl A): 68−76.
  53. Hamblin MT, Thompson EE & Di Rienzo A. Complex signatures of natural selection at the Duffy blood group locus. Am J Hum Genet 2002; 70: 369−383.
  54. Wakeley J & Aliacar N. Gene genealogies in a metapopulation. Genetics 2001; 159: 893−905.
  55. Simonsen KL, Churchill GA & Aquadro CF. Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 1995; 141: 413−429.
  56. Fuselli S, Tarazona-Santos E, Dupanloup I, Soto A, Luiselli D & Pettener D. Mitochondrial DNA diversity in South America and the genetic history of Andean Highlanders. Mol Biol Evol 2003; 20: 1682−1691.
  57. Ewing B, Hillier L, Wendl MC & Green P. Base-calling of automated sequencer traces using Phred. I. Accuracy assessment. Genome Res 1998; 8: 175−185.
  58. Gordon D, Abajian C & Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8: 195−202.
  59. Nickerson DA, Tobe VO & Taylor SL. PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucleic Acids Res 1997; 25: 2745−2751.
  60. Stephens M, Smith NJ & Donnelly P. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 2001; 68: 978−989.
  61. Rozas J & Rozas R. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 1999; 15: 174−175.
  62. Hartl DL & Clark AG. Principles of Population Genetics Sinauer Associates, Inc.: Sunderland, MA 1989; p 682.
  63. Tamura K & Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 1993; 10: 512−526.
  64. Kruskal J. Nonmetric multidimensional scaling: a numerical method. Psychometrika 1964; 29: 28−42.
  65. Schneider S, Roessli D & Excoffier L. Arlequin Ver. 2.0: A Software for Population Genetics Data Analysis Genetics and Biometry Laboratory, University of Geneva: Switzerland 2000.
  66. Hill WG & Robertson A. Linkage disequilibrium in finite populations. Theor Appl Genet 1968; 38: 226−231.
  67. Kelly JK. A test of neutrality based on interlocus associations. Genetics 1997; 146: 1197−1206.
  68. Lewontin RC. The interaction of selection and linkage. I. General considerations: heterotic models. Genetics 1964; 49: 49−67.
  69. Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 2002; 18: 337−338.

Acknowledgements

We thank Brian Verrelli, Joshua Akey, Floyd Reed, Molly Przeworski and Justin Fay for critical discussions and useful suggestions, Neide Silva for clarifying immunological concepts, Richard Hudson for advice about coalescence simulations using ms software, Gianfranco Destefano and Olga Rickards (Tor Vergata, Rome) for the Cayapa samples, and Agnes Awomoyi for assistance with collection of the Nigerian samples. Funded by Burroughs Wellcome Fund and David and Lucile Packard Career Awards and NSF Grant BCS-9905396 to SAT.
