The idea that evolutionary analyses can be a valuable complementary approach for assessing the functional importance of genetic variants is not new,1,2,3 but an elegant recent paper in Current Biology illustrates just how useful these studies can be: it shows that positive selection might have had a significant role in shaping population variation in a gene promoter polymorphism implicated in heart disease.4

Inherited differences in DNA sequences contribute to interindividual variability in anthropometric characteristics, risk of disease, and response to the environment and medication. Such variants occur roughly every 300–500 bp throughout the human genome, which has about 3 billion base pairs in total. However, only a small percentage of the estimated 10 million polymorphisms result in differences in protein amino-acid sequence or in the level of gene expression. Moreover, it is likely that only some of these functional variants are relevant to disease risk and other phenotypic traits. Molecular genetic epidemiological analyses and in vitro functional studies are the most common approaches used to assess the significance of particular genetic variants, but this new study illustrates that evolutionary analyses can be similarly effective.
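The figures quoted above can be checked with simple arithmetic. The short Python sketch below assumes variants are spread evenly along the genome, which is a simplification made only for illustration; the function name `expected_variant_count` is hypothetical.

```python
# Rough consistency check of the figures in the text, assuming evenly spaced variants.
GENOME_SIZE_BP = 3_000_000_000  # about 3 billion base pairs, as stated in the text


def expected_variant_count(spacing_bp: int, genome_size_bp: int = GENOME_SIZE_BP) -> int:
    """Number of variant sites if one occurs every `spacing_bp` base pairs."""
    return genome_size_bp // spacing_bp


# One variant every 300 bp and every 500 bp bound the range given in the text.
for spacing in (300, 500):
    print(f"one variant per {spacing} bp -> {expected_variant_count(spacing):,} sites")
```

A spacing of 300 bp gives 10 million sites and a spacing of 500 bp gives 6 million, so the estimate of about 10 million polymorphisms corresponds to the denser end of the quoted range.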

Matthew Rockman and co-workers investigated the evolutionary history of a polymorphism that has been associated with alternative cardiovascular phenotypes in a number of studies. The polymorphism is located in the promoter region of the matrix metalloproteinase-3 (MMP3) gene. Two alleles have been identified in humans, one with a run of five thymidines (5T) and the other with a run of six (6T), starting at nucleotide position −1608 relative to the transcriptional initiation site of the gene. Previous studies have indicated that this poly-T tract forms part of two overlapping transcription factor-binding sites. Although the two alleles have similar affinity for the transcription factor ZNF148 (also named ZBP89), which acts as a transcription enhancer, the 5T allele has a lower affinity than the 6T allele for another transcription factor (apparently an NF-κB P50/P50 dimer) that acts as a transcription repressor.5 As a result, MMP3 transcript and protein levels in ex vivo tissues are highest in 5T homozygotes, intermediate in heterozygotes, and lowest in 6T homozygotes.6

MMP3 (also known as stromelysin) enzymatically degrades various extracellular matrix proteins in the blood vessel wall and elsewhere. The expression level of this gene therefore seems to influence the balance between matrix protein synthesis and degradation, which, in turn, affects the amounts of matrix proteins in tissues. Genetic association studies of the MMP3 gene and atherosclerosis (hardening of the arteries) have shown a genotype–phenotype relationship, such that atheromas (plaques that form within the walls lining the arteries) in individuals of the 6T/6T genotype tend to be larger, whereas those in individuals of the 5T/6T or 5T/5T genotype are generally smaller but prone to rupture.7 Individuals who carry the 5T allele also seem to have greater arterial elasticity and a predisposition to developing coronary artery aneurysm.8 These observations are consistent with a model in which the balance is tipped towards matrix protein degradation in 5T allele carriers and towards matrix protein accumulation in 6T/6T individuals.

The authors' data4 indicate that the poly-T tract mentioned above might lie within a mutational hot spot that has undergone relatively rapid evolution for tens of millions of years. They compared the human MMP3 gene promoter sequence with those of nine non-human primates, and found that all of them contain the poly-T tract but that its length differs among species. In addition, intraspecies polymorphism was observed in seven of the primate species studied, further supporting the idea that this region is a mutational hot spot.

The study also suggests that natural selection has caused the frequency of the 5T allele in northern Europe to increase. There are considerable differences in 5T allele frequency among human populations, ranging from 0.01 in Cameroon to 0.54 in Sweden. Differences in allele frequency among populations can arise from natural selection and/or genetic drift (changes in allele frequency due to chance). The former is specific to each locus, whereas the latter affects all autosomal loci equally. To investigate whether the 5T/6T polymorphism has been a target of natural selection, the authors compared the patterns of genetic differentiation at the 5T/6T site among several populations with the patterns at 18 unlinked, neutral polymorphisms that selection is unlikely to affect. The analyses showed significant differences at the 5T/6T site between England and Cameroon and between England and India, a pattern indicating that positive selection might have increased the 5T allele frequency in northern Europe. In addition, haplotype analyses showed a significant excess of a common 5T haplotype in a European–American sample but not in an African–American sample, which is further evidence of positive selection for the 5T allele in Europe.
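The logic of comparing a candidate site with neutral reference loci can be sketched in a few lines of Python. The text does not say which differentiation statistic or significance test the authors used, so the sketch assumes Wright's F_ST for two equally weighted populations and a simple empirical rank against the neutral loci; the function names are hypothetical. Only the Cameroon and Sweden frequencies come from the text, and they mark the extremes of the reported range rather than the England comparisons that reached significance.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's F_ST for two equally weighted populations (an assumed statistic)."""
    h_total = heterozygosity((p1 + p2) / 2.0)
    if h_total == 0.0:
        return 0.0  # same allele fixed in both populations: no differentiation
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


def empirical_p_value(candidate_fst: float, neutral_fsts: Sequence[float]) -> float:
    """Share of loci (neutral set plus the candidate itself) at least as differentiated."""
    at_least = sum(1 for f in neutral_fsts if f >= candidate_fst)
    return (at_least + 1) / (len(neutral_fsts) + 1)


# 5T allele frequencies from the text: 0.01 in Cameroon, 0.54 in Sweden.
fst_5t = fst_two_populations(0.01, 0.54)
print(f"F_ST at the 5T/6T site (Cameroon vs Sweden): {fst_5t:.3f}")
```

Because drift acts on all autosomal loci alike, the neutral loci give the background level of differentiation expected between two populations. A candidate site whose value lies far above that background, that is, one with a small `empirical_p_value` when the neutral F_ST values for the same population pair are supplied, is a candidate target of selection.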

Further studies would of course be desirable to confirm these findings; nevertheless, the results in themselves strongly suggest that the 5T/6T polymorphism has an important functional effect. As coronary heart disease is a late-onset disorder, it is likely that the cardiovascular phenotypes associated with the 5T/6T polymorphism are pleiotropic by-products of natural selection acting on other traits that are currently unknown. The MMPs have important roles in many physiological processes in reproduction and development, as well as in the pathogenesis of a number of diseases.9 It is possible that there is an interaction between the MMP3 gene and specific European environmental factors that influences one or more of these physiological or pathological processes, and so confers a selective advantage.

Similar recent studies have detected signatures of natural selection at a number of other polymorphic sites,3,10,11,12,13 and we should expect to see many more studies in this area in the near future. Evolutionary analyses like these not only provide insights into human evolutionary history and the relative contributions of neutral and advantageous mutations to the wealth of polymorphisms in the human genome today, but also help to narrow down the sets of genetic variants to be examined in studies that aim to identify causes of disease risk and response to medicine.