Introduction

The enzyme 5,10-methylenetetrahydrofolic acid reductase (MTHFR) catalyzes the conversion of 5,10-methylenetetrahydrofolic acid to 5-methyltetrahydrofolic acid, which is the carbon donor for methylation of homocysteine to methionine via the remethylation pathway. The human MTHFR gene has been mapped to chromosomal region 1p36.3 and consists of 17 kb, including 11 exons spanning 2.2 kb. The MTHFR gene C677T polymorphism (MTHFR C677T) causes an amino-acid change from alanine to valine, which makes the enzyme thermolabile and reduces its activity by about half.1 This polymorphism is the most common genetic cause of hyperhomocysteinemia,2 which may be causally associated with diseases such as birth defects,3, 4, 5 cardiovascular disease,6, 7 schizophrenia and Alzheimer’s disease.8 Interestingly, this polymorphism is associated with elevated homocysteine levels only in cases of low folic acid intake.2, 9 On the other hand, it has also been suggested that the MTHFR 677 T allele may confer survival advantages in populations with sufficient dietary folic acid, because it may be protective against diseases such as colon cancer and acute lymphatic leukemia.10, 11, 12 Yet the mechanism behind the protective effects of the MTHFR C677T polymorphism is less well understood.

The prevalence of the MTHFR C677T polymorphism varies among global populations.9, 13 A latitude-dependent gradient in T allele frequency has been found in both the European and the native American populations. According to our observations, a similar trend also exists in the East Asian population. Most researchers have attributed this trend to insufficient nutritional intake and absorption, which are determined by regional vegetation and, in turn, by climate,9, 14 yet have not provided supporting evidence. We collected polymorphism data from published original articles and examined their relationship with several environmental factors, especially surface ultraviolet radiation (UV radiation), which we thought might determine the prevalence of the MTHFR C677T polymorphism. We hypothesized that these factors might have acted as natural selection pressures on the MTHFR C677T polymorphism throughout human evolution and so determined its global distribution pattern.

Materials and Methods

MTHFR data

We performed a computerized search of PubMed (http://www.pubmed.com/) and CNKI (http://www.cnki.net/) for relevant articles published before 31 March 2011. The search keywords were MTHFR or methylenetetrahydrofolic acid reductase. No language restriction was applied.

Within the computerized search results, manual selection was conducted. Articles were included in our research if they used a case–control or cohort design, were carried out in European, Asian or African countries, and provided usable data on the MTHFR C677T polymorphism. In the included articles, the research subjects had to be native residents, and the controls or cohorts had to be apparently healthy people.

The MTHFR C677T polymorphism data was extracted from the controls in case–control studies or from the cohorts in cohort studies. Additional information was extracted, including the study design, the ethnicity of the subjects and the locations where the studies were carried out. If two articles described the same original data, the more detailed data was included.

We aggregated MTHFR C677T polymorphism data from studies that were carried out in the same location, which we regarded as samples from the same population. From the aggregated data, we calculated the T and C allele frequencies and the CC, CT and TT genotype frequencies.
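The pooling and frequency calculation described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name and the genotype counts are hypothetical, and it assumes that each study reports raw CC, CT and TT counts and that pooling means summing those counts by location.

```python
def allele_and_genotype_frequencies(studies):
    """Pool genotype counts from studies at one location and return frequencies.

    `studies` is a list of (n_CC, n_CT, n_TT) tuples, one per study
    (hypothetical input layout, not taken from the paper).
    """
    n_cc = sum(s[0] for s in studies)
    n_ct = sum(s[1] for s in studies)
    n_tt = sum(s[2] for s in studies)
    n = n_cc + n_ct + n_tt
    if n == 0:
        raise ValueError("no genotyped subjects")
    # Each subject carries two alleles: TT contributes two T alleles, CT one.
    t_freq = (2 * n_tt + n_ct) / (2 * n)
    return {
        "CC": n_cc / n,
        "CT": n_ct / n,
        "TT": n_tt / n,
        "T": t_freq,
        "C": 1 - t_freq,
    }


# Hypothetical counts for two studies at the same location (illustration only).
pooled = allele_and_genotype_frequencies([(50, 40, 10), (60, 30, 10)])
```

Summing counts before dividing weights each study by its sample size, which differs from averaging the per-study frequencies.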

The geographic latitude and longitude of each city were obtained from Google Maps (http://www.google.com/maps) to an accuracy of 0.01°.

UV radiation data

The UV radiation data was downloaded from the NCAR website (http://cdp.ucar.edu/). The data presented the climatological distribution of monthly mean surface-level UV radiation, calculated using the tropospheric ultraviolet-visible radiative transfer model with inputs of ozone column amounts and cloud reflectivity (at 380 nm) measured by satellite instruments (Total Ozone Mapping Spectrometer, aboard Nimbus-7, Meteor-3 and Earth Probe). The climatology was averaged over the years 1979–2000 for UV-A (315–400 nm), UV-B (280–315 nm) and radiation weighted by the action spectra for the induction of erythema (skin-reddening), pre-vitamin D3 synthesis and non-melanoma carcinogenesis. Coverage was global, excluding the poles. More details are given in the book ‘UV Radiation in Global Climate Change Measurements, Modeling and Effects on Ecosystems, Chapter 1: A climatology of UV radiation’.15 From the monthly UV radiation data, we calculated the annual mean of each type of UV.

The geographical resolution of the UV data was 1.25° of longitude by 1° of latitude, so we approximated the city coordinates to the grid in order to join them with the UV radiation data.
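One way to carry out such an approximation is to round each city coordinate to the nearest grid node, as sketched below. The function name is hypothetical, and the grid origin is an assumption: the sketch places nodes at integer multiples of the step unless another origin is supplied, which may not match the layout of the actual data files.

```python
def snap_to_grid(lat, lon, lat_step=1.0, lon_step=1.25,
                 lat_origin=0.0, lon_origin=0.0):
    """Return the grid node nearest to (lat, lon).

    Assumes nodes lie at origin + k * step for integer k. The default
    origins of 0.0 are an assumption for illustration, not a property of
    the UV data set described in the text.
    """
    grid_lat = lat_origin + round((lat - lat_origin) / lat_step) * lat_step
    grid_lon = lon_origin + round((lon - lon_origin) / lon_step) * lon_step
    return grid_lat, grid_lon


# Hypothetical city coordinates given to 0.01 degrees.
node = snap_to_grid(39.90, 116.41)
```

With this resolution, a city can be displaced from its grid node by up to half a cell in each direction, so neighboring cities may share one UV value.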

Climatological data

The climatological data was taken from the website of the World Meteorological Organization (WMO; http://www.worldweather.org/), which gathers data from hundreds of climatological stations worldwide. The website provides monthly climatological data for the past 20 years, including monthly means of daily temperature, daily maximum temperature and daily minimum temperature, as well as total rainfall, rain days and daily sunshine hours. We searched for climatological data for every location included in the MTHFR C677T polymorphism data. If no climatological station matched a location, we used the geographically nearest station as a proxy, which we thought would provide the closest approximation of the local climate.
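The choice of a proxy station can be illustrated with a great-circle distance search. This is only a sketch: the text does not state how distance was measured, so the haversine formula is an assumption, and the station names and coordinates below are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_station(lat, lon, stations):
    """Return the (name, lat, lon) station closest to the study location."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s[1], s[2]))


# Hypothetical station list and study location (illustration only).
stations = [("A", 40.0, 116.0), ("B", 31.0, 121.0)]
proxy = nearest_station(39.1, 117.2, stations)
```

A nearest-distance rule ignores altitude and coastal effects, so a proxy station can still differ noticeably in climate from the study location.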

From the data provided by the WMO, we calculated the annual means of daily temperature, daily maximum temperature, daily minimum temperature and daily sunshine hours. We obtained the annual total rainfall and rain days by summing the monthly data. We also obtained the annual mean daily temperature difference by averaging the differences between the daily maximum and minimum temperatures.
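The annual aggregation described above can be sketched as follows. The record layout and key names are hypothetical, and the sketch assumes unweighted means over the twelve months (months are not weighted by their number of days), which the text does not specify.

```python
from statistics import fmean


def annual_summary(monthly):
    """Aggregate 12 monthly climate records into annual values.

    Each record is a dict with the hypothetical keys tmean, tmax, tmin,
    rain_total, rain_days and sun_hours.
    """
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly records")
    return {
        # Means of the monthly means (unweighted; an assumption).
        "mean_temp": fmean(m["tmean"] for m in monthly),
        "mean_tmax": fmean(m["tmax"] for m in monthly),
        "mean_tmin": fmean(m["tmin"] for m in monthly),
        "mean_temp_range": fmean(m["tmax"] - m["tmin"] for m in monthly),
        "mean_sunshine_hours": fmean(m["sun_hours"] for m in monthly),
        # Rainfall quantities are summed over the year.
        "total_rain": sum(m["rain_total"] for m in monthly),
        "total_rain_days": sum(m["rain_days"] for m in monthly),
    }


# Hypothetical monthly records (illustration only).
months = [
    {"tmean": 10.0 + i, "tmax": 15.0 + i, "tmin": 5.0 + i,
     "rain_total": 50.0, "rain_days": 8, "sun_hours": 6.0}
    for i in range(12)
]
example = annual_summary(months)
```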

Statistical analysis

The T allele distribution pattern among the European and Eastern Asian populations was shown by cartographic representation. We plotted the locations where the studies were carried out, with gradient colors representing the T allele frequency of each population.

Linear and quadratic regression models were used to examine the relationship between the MTHFR C677T polymorphism and UV radiation and other climatological factors. We included T allele frequency, TT genotype frequency, CT genotype frequency and TT+CT genotype frequency separately as the dependent variable. All types of UV radiation and all climatological factors were included separately as independent variables. As the independent variables were closely correlated with each other, single-variable regression models were used. For every independent variable we calculated a square term, which was entered in the quadratic regression model to measure the inverse U-shape relationship between the MTHFR C677T polymorphism and the environmental factors. All statistical analysis was carried out using SPSS v11.0 software. A two-sided P-value <0.05 was considered statistically significant.
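The single-variable quadratic model has the form y = b0 + b1·x + b2·x², where an inverse U-shape corresponds to b2 < 0 with a maximum at x = −b1/(2·b2). The sketch below fits this model by ordinary least squares in plain Python and reports R-square. It is an illustration only, not a reproduction of the SPSS analysis: it computes no P-values, the function names are hypothetical, and the data points are invented.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2; returns ((b0, b1, b2), R^2)."""
    design = [[1.0, x, x * x] for x in xs]  # intercept, linear and square terms
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(3)]
    b0, b1, b2 = solve_linear(xtx, xty)  # normal equations
    fitted = [b0 + b1 * x + b2 * x * x for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("dependent variable is constant")
    return (b0, b1, b2), 1 - ss_res / ss_tot


# Invented latitude / T allele frequency pairs (illustration only).
lat = [0, 10, 20, 30, 40, 50, 60, 70]
t_freq = [0.45 - 0.0002 * (x - 40) ** 2 for x in lat]
(b0, b1, b2), r_squared = fit_quadratic(lat, t_freq)
peak = -b1 / (2 * b2)  # location of the maximum when b2 < 0
```

Because each independent variable is fitted in its own model, R-square values can be compared across variables but the models say nothing about their joint effect.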

Results

Distribution of MTHFR C677T polymorphism

A total of 205 Chinese-language studies and 169 English-language studies provided eligible data. Of these, 93 Chinese-language and 141 English-language studies had a control group sample size of more than 100. We aggregated the data by location and obtained 154 merged population groups, of which 46 were in China.

The prevalence of the MTHFR C677T polymorphism in each population is shown on the map (Figure 1). The location dots are colored on a gradient according to T allele frequency. The populations residing on the Apennine peninsula in Europe and around the Bohai bay rim in China had the highest T allele frequency (nearly 50%), whereas populations residing near the equator had the lowest (less than 10%). The Inuit, who reside in the polar area of North America (not shown on the map), had a similarly low T allele frequency (about 6%).9, 13

Figure 1

The MTHFR C677T polymorphism distribution in Eurasia. Populations are marked by dots colored on a gradient according to T allele frequency, with blue the highest and red the lowest. The color intervals were defined by equal counts; that is, each color interval contains an equal number of populations.
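Equal-count color intervals of this kind can be obtained by ranking the populations and cutting the ranking into groups of equal size. The sketch below is a hypothetical illustration, not the mapping software's own procedure; the function name, the number of intervals and the frequencies are invented.

```python
def equal_count_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 so that bins hold
    near-equal numbers of items (ties are split by original order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


# Invented T allele frequencies for six populations, three color intervals.
color_index = equal_count_bins([0.10, 0.45, 0.30, 0.25, 0.50, 0.05], 3)
```

With equal-count intervals the color boundaries follow the data, so the frequency width of each interval differs and colors are not comparable across maps built from different samples.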

A latitude-dependent gradient in T allele frequency was found in both the European and East Asian populations. Along the same longitude in Europe, the T allele frequency was lower in northern European populations and higher in Mediterranean populations, increasing from about 25% to about 50%, and then decreased in the Northern African populations. In contrast, the T allele frequency was higher in the northern Chinese population and lower in both the southern Chinese and Southeast Asian populations, decreasing from about 50% to about 10%. Southern Europe lies at the same latitude as northern China and has an equally high T allele frequency; northern Africa lies at the same latitude as southern China and has an equally low T allele frequency. Concatenating the two latitudinal gradients from Western Europe and East Asia gave a single pattern in which the relationship between T allele frequency and latitude was an inverse U-shape.

Less obvious was a coastal–inland decreasing trend in T allele frequency, which could be found in both the European and East Asian populations. Along the same latitude, the T allele frequency was higher in western European populations and lower in eastern European populations, decreasing from about 35% to about 25%. Similarly, the T allele frequency was higher in coastal Chinese populations and lower in inland Chinese populations, decreasing from about 50% to about 30%.

Correlation and regression

Figures 2 and 3 show scatter plots of the prevalence of the MTHFR C677T polymorphism against latitude, UV-A radiation and UV-B radiation. The inverse U-shape relationship between T allele frequency and latitude inferred above was confirmed quantitatively by the statistically significant R-square of the quadratic regression models (Figure 2). Moreover, the relationship between the MTHFR C677T polymorphism and UV radiation was also an inverse U-shape (Figure 3).

Figure 2

The scatter plots between T allele frequency and latitude and UV-A.

Figure 3

The scatter plots between MTHFR genotype frequency and UVs.

The results of the single-factor quadratic regression models are shown in Tables 1 and 2. The models used T allele frequency, TT genotype frequency, CT genotype frequency and CT+TT genotype frequency separately as the dependent variable, and latitude, seven climatological factors (annual mean of daily temperature, daily maximum temperature, daily minimum temperature, daily temperature difference, total rainfall, total rain days, daily sunshine hours) and five types of UV radiation (UV-A, UV-B, erythema-inducing UV, pre-vitamin D3 synthesis UV and non-melanoma carcinogenesis UV) separately as the independent variable.

Table 1 The results of quadratic regression model between MTHFR C677T polymorphism and latitude and climatological variables
Table 2 The results of quadratic regression model between MTHFR C677T polymorphism and UV radiation

As for the independent variables, all five types of UV radiation had better explanatory power than latitude or any climatological factor. UV radiation adjusted for clouds and aerosols had better explanatory power than unadjusted UV radiation. UV-B radiation had better explanatory power than UV-A radiation. As for the dependent variables, the CT+TT genotype frequency was explained best. The CT genotype frequency was better explained by UV radiation than the T allele or TT genotype frequency.

Discussion

As our study has shown, the prevalence of the MTHFR 677 T allele, as well as of the TT and CT genotypes, varies among populations, following a latitudinal inverse U-shape gradient and a longitudinal coastal–inland gradient in both the European and Eastern Asian populations.9, 13 Behind the visible latitudinal and longitudinal distribution pattern, the factors actually responsible are latent environmental factors that also vary with latitude and longitude, with UV radiation being the most likely candidate.

We cannot entirely dismiss the dietary nutrition hypothesis that most researchers hold.13 This hypothesis may be plausible given the confirmed relationship between folic acid and health. However, when we analyzed the dietary habits of the populations, this hypothesis could not explain why the Southeast Asian population, whose diet contains more green vegetables, has a much lower T allele frequency than the European population. Similarly, the southern Chinese eat more green vegetables than the northern Chinese, yet have a lower T allele frequency. Therefore, some other factors must be responsible.

Several years ago, Cordain et al.16 first hypothesized that UV radiation might adversely influence folic acid status, and thus influence the world distribution pattern of the MTHFR C677T polymorphism.17 They put forward experimental evidence showing that exposure of human plasma in vitro to simulated strong sunlight caused a 30–50% loss of folic acid within 60 min. Our results support their hypothesis and suggest a more detailed model, as follows:

First, people residing at high latitudes were exposed to less UV radiation. Their lighter, less pigmented skin was favored by selection because it improves vitamin D status and so counters the risk of vitamin D-related disorders. Yet lighter skin with less melanin may be less protective against UV photolysis of folic acid in vivo, which may already have been inadequate because of low dietary intake. Under these conditions, the CC and CT genotypes, which use folic acid more efficiently, would convey a selective advantage, and their carriers were more likely to survive. Second, people residing at mid-latitudes were exposed to moderate UV radiation yet had darker, more pigmented skin, which may be more protective against UV radiation. Under these conditions, vitamin D was synthesized adequately and body folic acid was less liable to be photolyzed. Moreover, the favorable climate of the temperate zone provided plentiful vegetable foods, so body folic acid status could be improved by dietary intake. In such a favorable environment, the T allele could be maximally preserved. Third, people residing near the equator had the darkest, most pigmented skin and were exposed to strong UV radiation. Even the highest levels of skin melanin could not fully protect against UV radiation, and their plentiful dietary intake of green vegetables was equally unable to compensate, so body folic acid was easily lost. The CC genotype, which uses folic acid most efficiently, would convey a selective advantage. Therefore, the T allele was mostly eliminated by natural selection (Figure 4).

Figure 4

The evolutionary model of natural selection on MTHFR C677T polymorphism.

This hypothesis was supported by our research. In our results, UV radiation was more strongly associated with the MTHFR C677T polymorphism than latitude was. Because surface UV radiation is determined by factors other than latitude, such as atmospheric ozone, clouds and aerosols, altitude and other geographic features, we attribute the explanatory power of latitude for the MTHFR C677T polymorphism to its relationship with UV radiation. Moreover, UV radiation adjusted for clouds and aerosols, which better represents conditions at the earth’s surface, was more strongly correlated with the MTHFR C677T polymorphism. This reinforces the relationship between UV radiation and the MTHFR C677T polymorphism. Even more interestingly, in all regression models UV-B was more strongly associated with the MTHFR C677T polymorphism than UV-A, which is consistent with the fact that UV-B is more harmful and is therefore a more plausible source of natural selection pressure than UV-A.18

Also, none of the climatological factors involved in our analysis showed a better correlation with the MTHFR C677T polymorphism than UV radiation. This argues against the dietary nutrition hypothesis that most researchers hold.13 It is well known that rainfall and temperature are the most important factors determining regional vegetation, and thus people’s diet. In our results, the annual total rainfall was not significant. Although the annual mean of daily temperature showed better explanatory power, its correlation with the MTHFR C677T polymorphism, like that of latitude, was more likely to be confounded by UV radiation, because temperature shares determining factors with UV radiation, such as latitude and altitude.

The effect of evolution and migration in human history on the distribution of the MTHFR C677T polymorphism cannot be neglected. We compared the distribution of the MTHFR C677T polymorphism with the maximum likelihood network inferred from Y-chromosome haplotype frequencies by geographical population group. According to the analysis of Y-chromosome haplotypes, there are three main clades within the modern human population: the first clade includes the populations residing in Pakistan and India, Central Asia and Siberia, the Mideast, Morocco and Europe; the second clade includes the populations residing in New Guinea and Australia, Cambodia and Laos, Japan, Taiwan and China; and the third clade includes all Sub-Saharan African populations.19 We repeated the correlation and regression analysis within these clades and reached consistent conclusions: there was an inverse U-shape association between the MTHFR C677T polymorphism and UV radiation in both the Europe-Middle East and the East Asia-Southwest Pacific population groups (Supplementary 1).

Last and most importantly, the MTHFR C677T polymorphism is not the only candidate that has been suggested to have undergone natural selection. Shi et al.20 have shown that there was a tight association between cold winter temperature and p53 Arg72, as well as between low UV intensity and MDM2 SNP309 G/G, in a cohort of 4029 individuals across Eastern Asia. These polymorphisms, including MTHFR C677T in our research, are all striking examples of functional polymorphisms being strongly selected for in human populations in response to environmental stresses. To confirm these findings, more sophisticated population genetics studies with large sample sizes are needed.

Conclusion

Our results supported the hypothesis that the distribution of the MTHFR C677T polymorphism in Eurasia might be the result of genetic factors and of natural selection by environmental factors, especially UV radiation. As the MTHFR C677T polymorphism has an ancestral origin,14 we believe that natural selection might have been at work as early as 100 000 years ago, when the climate began to change dramatically. At the same time, prehistoric humans began to migrate out of Africa. As they were exposed to new adverse environments, the genetic polymorphism came under the pressure of natural selection, thus forming the distribution pattern of the MTHFR C677T polymorphism in Eurasia.