Frequencies of clinically important CYP2C19 and CYP2D6 alleles are graded across Europe

CYP2C19 and CYP2D6 are important drug-metabolizing enzymes that are involved in the metabolism of around 30% of all medications. Importantly, the corresponding genes are highly polymorphic and these genetic differences contribute to interindividual and interethnic differences in drug pharmacokinetics, response, and toxicity. In this study we systematically analyzed the frequency distribution of clinically relevant CYP2C19 and CYP2D6 alleles across Europe based on data from 82,791 healthy individuals extracted from 79 original publications and, for the first time, provide allele confidence intervals for the general population. We found that frequencies of CYP2D6 gene duplications showed a clear South-East to North-West gradient ranging from <1% in Sweden and Denmark to 6% in Greece and Turkey. In contrast, an inverse distribution was observed for the loss-of-function alleles CYP2D6*4 and CYP2D6*5. Similarly, frequencies of the inactive CYP2C19*2 allele were graded from North-West to South-East Europe. In important contrast to previous work we found that the increased activity allele CYP2C19*17 was most prevalent in Central Europe (25–33%) with lower prevalence in Mediterranean-South Europeans (11–24%). In summary, we provide a detailed European map of common CYP2C19 and CYP2D6 variants and find that frequencies of the most clinically relevant alleles are geographically graded reflective of Europe’s migratory history. These findings emphasize the importance of generating pharmacogenomic data sets with high spatial resolution to improve precision public health across Europe.


Introduction
Interindividual variability in therapeutic drug response can result in adverse drug reactions (ADRs) or lack of efficacy and constitutes a key challenge for health care systems. Notably, 40-70% of patients experience insufficient drug response or drug toxicity, and ADRs account for 6.5% of all hospital admissions, of which up to 30% are life-threatening in at-risk subpopulations [1][2][3][4]. Genetic polymorphisms in drug-metabolizing enzymes, transporters, or drug targets explain around 20-30% of these interindividual differences [5].
Cytochrome P450 (CYP) enzymes constitute a polymorphic superfamily, consisting of 57 functional members in humans [6], that metabolize >80% of all clinically used medications [7]. Among those, CYP2C19 and CYP2D6 are of particular clinical relevance, as they are highly polymorphic and implicated in the metabolism of numerous widely prescribed drugs. CYP2C19 substrates include the tricyclic antidepressants amitriptyline, clomipramine, doxepin and imipramine, the selective serotonin reuptake inhibitors citalopram and sertraline, the antifungal voriconazole, as well as the antiplatelet agent clopidogrel. CYP2C19*2 (rs4244285) is the most common allelic variant in Caucasians and results in aberrant splicing and loss of enzyme activity [8]. In contrast, the regulatory polymorphism rs12248560 defining CYP2C19*17 increases transcriptional activity and causes ultrarapid CYP2C19 metabolism [9].
CYP2D6 metabolizes around 25% of currently prescribed drugs, including various antidepressants, neuroleptics, betablockers, opioids, antiemetics, and antiarrhythmics. Of the more than 100 allelic variants for CYP2D6 that have been described so far, CYP2D6*4 (rs3892097) is the most prevalent loss-of-function allele in Caucasian individuals. Furthermore, CYP2D6 harbors functionally relevant copy number variations (CNVs) in which the whole open reading frame is duplicated (e.g., CYP2D6*1×N and CYP2D6*2×N) or deleted (CYP2D6*5), resulting in increased or decreased metabolism of CYP2D6 substrates, respectively.
While frequencies of CYP2C19 and CYP2D6 variations have been extensively studied, these studies were either focused on selected geographical regions or analyzed data aggregated by ethnicity or ancestry [10][11][12]. Therefore, in the present study, we systematically analyzed 79 original publications covering 82,791 healthy volunteers throughout Europe for CYP2C19 and CYP2D6 variants to provide a high-resolution map of pharmacogenetically relevant variability across European populations. Analysis of this consolidated data set revealed that the loss-of-function variants CYP2C19*2, CYP2D6*4, and CYP2D6*5 were graded from Northern Europe to the Mediterranean, whereas CYP2D6 duplications showed an inverse pattern. Furthermore, in contrast to previous reports, we find clear evidence that CYP2C19*17 is most common in Central Europe, whereas prevalence is lower in Southern Europeans. Combined, these data reveal the extent of intra-European pharmacogenetic variability and underscore the importance of using local genomic information for conducting pharmacogenetic analyses, clinical trials, and precision public health.

Methods
For the present study we performed a systematic literature survey of the PubMed database covering articles published before December 2018. The search query criteria were (CYP2C19 or CYP2D6) AND (allele OR genotype OR frequency OR prevalence OR polymorphism) AND (European). All studies reporting genotype or allele frequencies of CYP2C19*2 (rs4244285; NC_000010.11:g.94781859G>A), CYP2C19*17 (rs12248560; NC_000010.11:g.94761900C>T), CYP2D6*3 (rs35742686; NC_000022.11:g.42128242delT), CYP2D6*4 (rs3892097; NC_000022.11:g.42128945C>T), CYP2D6*5 (CYP2D6 gene deletion), or of functional gene duplications (CYP2D6*1×N or CYP2D6*2×N) in healthy individuals of clear geographic origin within a European country were included. Variant positions are provided based on GRCh38. Only original research articles available in English were considered. In addition, we included data from the Genome Aggregation Database [13], the 1000 Genomes Project [14], the SweGen project [15], and the Estonian biobank [16]. In total, we identified 79 original articles, and 82,791 individuals were included in the analysis (Supplementary Tables 1 and 2). For countries for which multiple studies were available, data were aggregated as a weighted average, with each study's cohort size as the weighting factor. For additional information about the haplotypes in question we refer the interested reader to the website of the Pharmacogene Variation Consortium (https://www.pharmvar.org).
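The cohort-size weighting described above can be illustrated with a minimal Python sketch. The class and function names, as well as the example cohort sizes and frequencies, are hypothetical and chosen for illustration only; they are not taken from the studies included in this analysis.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StudyEstimate:
    """One study's allele frequency estimate for a single country (hypothetical structure)."""
    n_individuals: int        # cohort size, used as the weight
    allele_frequency: float   # fraction between 0 and 1


def pooled_frequency(studies: Iterable[StudyEstimate]) -> float:
    """Average allele frequencies across studies, weighted by cohort size."""
    studies = list(studies)
    total_n = sum(s.n_individuals for s in studies)
    if total_n <= 0:
        raise ValueError("at least one study with a non-empty cohort is required")
    weighted_sum = sum(s.allele_frequency * s.n_individuals for s in studies)
    return weighted_sum / total_n


# Illustrative values only: two studies from the same country.
example = [
    StudyEstimate(n_individuals=100, allele_frequency=0.10),
    StudyEstimate(n_individuals=300, allele_frequency=0.20),
]
print(f"pooled frequency: {pooled_frequency(example):.3f}")
```

With this weighting, the larger cohort dominates the pooled estimate, so a single small study with an outlying frequency has limited influence on the country-level value.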

Frequencies of important CYP2C19 alleles exhibit large intra-European differences
For CYP2C19 we assessed the prevalence of the loss-of-function allele CYP2C19*2 and the increased function variant CYP2C19*17. In Europe, the frequency of CYP2C19*2 was the highest in Cyprus (21%) and Malta (20%), whereas the lowest prevalence was reported in the Czech Republic (8%; Fig. 1; Table 1). Furthermore, frequencies were high in Romani individuals (20.8%). Overall, CYP2C19*2 was slightly more prevalent in Northern and Western European countries, such as Finland (17.5%), the Faroe Islands (18.8%), and France (17.7%), compared with countries on the Mediterranean coast, including Italy (11.8%) and Turkey (11.3%).
On the contrary, CYP2C19*17 was most common in Central Europe with highest frequencies in Slovakia (33%; Table 1). However, the CYP2C19 genotyping data reported for Slovakia included only 26 subjects and should thus be interpreted with caution [17]. In contrast, frequencies were lower in Southern European countries, such as Spain (17.1%), Greece (18.2%), and Cyprus (11%), as well as in Scandinavia (19-22%) and Russia (15%).
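To illustrate why an estimate from 26 subjects warrants caution, the following Python sketch computes a 95% Wilson score interval for an allele frequency. Several assumptions apply: the interval method used in this study is not specified in the text, so the Wilson interval is only one common choice; each subject is assumed to contribute two alleles (52 in total); and the allele count of 17 is a hypothetical value chosen to approximate a frequency of 33%.

```python
import math
from typing import Tuple


def wilson_interval(allele_count: int, total_alleles: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for roughly 95% coverage)."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    p = allele_count / total_alleles
    z2 = z * z
    denom = 1.0 + z2 / total_alleles
    centre = (p + z2 / (2.0 * total_alleles)) / denom
    half_width = (
        z * math.sqrt(p * (1.0 - p) / total_alleles + z2 / (4.0 * total_alleles ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical small cohort: 26 diploid subjects -> 52 alleles, 17 carrying the variant.
small_low, small_high = wilson_interval(17, 52)

# Same proportion in a hypothetical cohort 100 times larger.
large_low, large_high = wilson_interval(1700, 5200)

print(f"small cohort: {small_low:.3f} to {small_high:.3f}")
print(f"large cohort: {large_low:.3f} to {large_high:.3f}")
```

Under these assumptions the interval for the small cohort spans more than 20 percentage points, whereas the same proportion observed in the larger cohort yields a much narrower interval.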

Discussion
Interethnic differences in drug pharmacokinetics or dynamics constitute important factors to consider for increasingly multinational drug development programs and genetic variability in drug-metabolizing enzymes constitutes an important factor underlying these differences. As a result, the labels of multiple marketed drugs, including rosuvastatin, carbamazepine, and tacrolimus, contain recommendations to adjust starting doses based on ethnicity [18]. CYP2C19 and CYP2D6 harbor multiple genetic polymorphisms, which differ substantially between ethnic groups and geographic regions and can entail clinically important differences in drug response. To date, numerous studies have analyzed the frequencies of these polymorphisms; yet, the available allele frequency data have, to our knowledge, not yet been systematically consolidated into high-resolution maps of CYP2C19 and CYP2D6 variability within Europe. We therefore compiled data from 79 original publications resulting in aggregated genotypes for the most relevant CYP2C19 and CYP2D6 alleles from 82,791 healthy individuals. Notably, while most studies provided data from unrelated individuals, we cannot exclude relatedness across studies. However, we do not expect this fraction to significantly impact the accuracy of the reported frequencies.
Frequency of functional CYP2D6 gene duplications was highest in Greece and Turkey and lowest in Scandinavian countries, which is in accordance with decreasing frequencies of ultrarapid metabolizers in a direction from Southern to Northern European populations [19]. Globally, CYP2D6 duplication is most common in North-East Africa and the Middle East with frequencies of 7-16% [20][21][22]. It has been speculated that the evolutionary basis for this gradient is the role of CYP2D6 in the detoxification of plant alkaloids, which allowed carriers of duplicated alleles to tap food sources during times of starvation that would have been toxic for normal CYP2D6 metabolizers [23]. Inversely, frequencies of the loss-of-function alleles CYP2D6*4 and CYP2D6*5 were highest in Scandinavia and lowest on the Mediterranean with further decreasing frequencies in Ethiopia and the Arabian peninsula [20][21][22]. These data thus corroborate the hypothesis that CYP2D6 metabolic capacity might have been under selective pressure specifically in North-East Africa and subsequent migration events resulted in the high frequencies of ultrarapid CYP2D6 metabolizers in Southern Europe.
We observed that CYP2C19*2 was graded from North-West to South-East Europe. Interestingly, we observed a high frequency of CYP2C19*2 in Romani (20.8%) that was significantly different from the hosting Hungarian population (13.3%; p < 0.01; [24]). The Roma minority originates from North-West India, and due to a series of population bottlenecks with multiple founder events and low number of interethnic marriages constitutes a relatively homogeneous ethnic group [25]. As a consequence of this complex population history, CYP2C19*2 frequencies in Roma were similar to those reported in North Indian populations [26]. Thus, pharmacogenetic variability in Roma is distinctly different from European populations and affiliation to a Roma group might be a factor of consideration for treatment decisions of CYP2C19 substrates.
The frequency of CYP2C19*17 was highest in Central Europe and lower in Southern European countries. Our findings are in drastic contrast to a meta-analysis performed by Fricke-Galindo et al. who reported that CYP2C19*17 is predominantly found in Mediterranean countries with frequencies of 42% [11]. However, we find that frequencies are substantially lower throughout Southern Europe, pivoting around 20-25%. Careful revisiting of the original data revealed that instead of the frequency (14.9%), Fricke-Galindo et al. erroneously used the number of individuals (n = 42) for the Spanish population [27] as population frequency. Our findings of moderate CYP2C19*17 frequencies in Southern Europe align with data from Northern African and Middle Eastern populations in which CYP2C19*17 allele frequencies between 17.9% and 26.9% have been reported for Ethiopians, Saudi Arabians, Kurds, and Turks [9][28][29][30]. Furthermore, low frequencies (15.9%) have been found in Sephardic Jews [31], who originated from Jews on the Iberian peninsula in the 15th century; this value is in close agreement with the aggregated prevalence we found in contemporary Spanish individuals (17.1%). The distribution of CYP2C19 alleles thus reflects the migratory history of European populations.
These findings have potentially important implications, as CYP2C19 genotype is included as a pharmacogenomic biomarker in the drug labels of 22 medications. Furthermore, guidelines issued by pharmacogenetics expert workgroups (CPIC and DPWG) provide recommendations to optimize genotype-guided prescription for 14 drugs [32]. For instance, CYP2C19 genotype affects treatment efficacy and risk of adverse events when treated with the antidepressant escitalopram [33], and for ultrarapid CYP2C19 metabolizers it is recommended to select an alternative drug not predominantly metabolized by CYP2C19. As the cost effectiveness of pharmacogenetic implementation is dependent on carrier frequencies, falsely high population frequencies might erroneously incentivize pre-emptive CYP2C19 genotyping. Notably, while genotype data for CYP2C19 and CYP2D6 were available for more than 80,000 individuals from 31 European countries, cohort coverage was geographically highly unequal (Table 1). For eight countries fewer than 100 individuals were genotyped and, as a result, population frequencies in these countries could only be estimated with wide confidence intervals. Thus, these analyses incentivize the country-specific expansion of genotype data to further refine estimates of intra-European CYP allele frequencies. Furthermore, while CYP genotype-derived activity scores constitute important proxies for the prediction of metabolic capacity, they can only explain a fraction of the observed functional variability [34]. One underlying reason could be rare variants beyond the tested polymorphisms that contribute to gene function. In this regard CYP2C19 and CYP2D6 have indeed been found to harbor a plethora of rare genetic single nucleotide variants (SNVs) with putative functional importance [35][36][37]. Furthermore, rare population-specific CNVs can contribute to functional variability. For instance, CYP2C19 has recently been found to be deleted specifically in Finns with frequencies of 0.8% [38]. However, information regarding the prevalence of these rare SNVs and CNVs is currently not available with high geographic resolution and the generation of such sequencing-based pharmacogenomic data sets constitutes an interesting avenue for future research that will help to refine genotype-guided drug response predictions [39][40].
In conclusion, we provide refined maps of clinically important CYP2C19 and CYP2D6 genetic variability across European populations. Our findings support the need for refined pharmacogenomic mapping to guide precision public health.

Compliance with ethical standards
Conflict of interest: The authors declare that they have no conflict of interest.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.