Original Research Article | Open

Prediction of CYP2D6 phenotype from genotype across world populations

Genetics in Medicine volume 19, pages 69–76 (2017)

  • A Corrigendum to this article was published on 31 October 2016



Owing to its highly polymorphic nature and major contribution to the metabolism and bioactivation of numerous clinically used drugs, CYP2D6 is one of the most extensively studied drug-metabolizing enzymes and pharmacogenes. CYP2D6 alleles confer no, decreased, normal, or increased activity and cause a wide range of activity among individuals and between populations. However, there is no standard approach to translate diplotypes into predicted phenotype.


We exploited CYP2D6 allele-frequency data that have been compiled for Clinical Pharmacogenetics Implementation Consortium (CPIC) guidelines (>60,000 subjects, 173 reports) in order to estimate genotype-predicted phenotype status across major world populations based on activity score (AS) assignments.


Allele frequencies vary considerably across the major ethnic groups, predicting poor metabolizer status (AS = 0) in 0.4 to 5.4% of subjects across world populations. The prevalence of genotypic intermediate (AS = 0.5) and normal (AS = 1, 1.5, or 2) metabolizers ranges between 0.4 and 11% and between 67 and 90%, respectively. Finally, 1 to 21% of subjects (AS >2) are predicted to have ultrarapid metabolizer status.


This comprehensive study summarizes allele frequencies, diplotypes, and predicted phenotype across major populations, providing a rich data resource for clinicians and researchers. Challenges of phenotype prediction from genotype data are highlighted and discussed.

Genet Med 19 1, 69–76.


CYP2D6 is a member of the cytochrome P450 gene superfamily, which comprises many important phase I drug-metabolizing enzymes, and contributes to the metabolism of up to 25% of clinically used drugs.1,2 As reviewed by He et al.,3 CYP2D6 is the predominant pathway for the elimination or bioactivation of many centrally acting drugs, including tricyclic antidepressants, selective serotonin reuptake inhibitors, opioids, and antiemetics, but also antiarrhythmics, β-blockers, antihistamines, and antiviral agents. CYP2D6 also plays a role in the metabolism of drugs of abuse and has been shown to bioactivate a number of procarcinogens and neurotoxins.

The CYP2D6 gene is located on Chr22q13.1 in close proximity to two nonfunctional pseudogenes, CYP2D7 and CYP2D8, and has been extensively studied after it was first described in 1989.4 To date, more than 100 allelic variants, not counting subvariants, have been defined by the Human Cytochrome P450 (CYP) Allele Nomenclature Database at http://www.cypalleles.ki.se/.5 Sequence and structural variations give rise to alleles conferring no, decreased, normal, or increased function, leading to a wide range of enzyme activity among individuals and populations.3,6 On one end of the extreme are the so-called poor metabolizers, i.e., individuals with two nonfunctional alleles who are unable to metabolize or bioactivate drugs through the CYP2D6 pathway. On the other end of the extreme are ultrarapid metabolizers who carry at least one increased function allele (i.e., two or more copies of a functional allele on one chromosome) in addition to a normal-function allele. These two metabolizer groups are at the highest risk for experiencing dose-related adverse events or treatment failure, depending on the particular substrate involved.

CYP2D6 allele frequencies have been shown to vary substantially among world populations.7,8 Some allelic variants are present across populations at similar frequencies, whereas others are observed at vastly different frequencies or have only been detected in a certain ethnic group. Polymorphic expression of CYP2D6 is the most important single factor impacting variable activity within and among populations.

The Clinical Pharmacogenetics Implementation Consortium (CPIC) develops guidelines that enable the translation of genetic laboratory test results into actionable prescribing decisions for specific drugs (https://www.pharmgkb.org/page/cpic and https://cpicpgx.org).9 Since its inception in 2009, 22 guidelines and updates have been published, including three involving CYP2D6 (CYP2D6/codeine;10,11 CYP2D6 and CYP2C19/tricyclic antidepressants;12 and CYP2D6 and CYP2C19/selective serotonin reuptake inhibitors13). In the supplemental materials of the CYP2D6 guidelines, allele frequencies across populations have been systematically captured and updated with each new guideline or update. To our knowledge, this is the most comprehensive and up-to-date resource for CYP2D6 allele-frequency information. It is freely available through the PharmGKB at http://www.pharmgkb.org/download.action?filename=CYP2D6_Frequency_Table_and_Legend_R3.pdf.

For this report, we have updated the allele frequencies in this table to include the most recent publications describing allele frequencies, calculated diplotype frequencies for each study, and translated diplotypes into phenotype based on activity score (AS).14 The prevalence of genetically predicted phenotypes, referred to as genetic poor, intermediate, normal, and ultrarapid metabolizers (gPM, gIM, gNM, and gUM, respectively), is shown for each study and for major populations.

Materials and Methods

Literature review and allele-frequency tabulation

The PubMed database (1995–2015) was searched using the following criteria: (CYP2D6 or 2D6 or cytochrome P4502D6) AND (genotype OR allele OR frequency OR minor allele OR variant OR ethnic OR race OR racial OR ethnicity) with filter limits set to retrieve “full-text” and “English” literature. Reports were also identified from citations by others or review articles. Studies were included if (i) the ethnicity of the population was clearly indicated, (ii) either allele frequencies or genotype frequencies were reported, (iii) the method by which the genes were genotyped was indicated, (iv) the sample population consisted of ≥50 individuals with a few exceptions (e.g., smaller cohorts that were part of larger studies), and (v) the study represented an original publication. A few studies were excluded owing to apparent technical shortcomings or flaws in the interpretation of single-nucleotide polymorphism (SNP) data for allele assignment. For all included studies, the first author, year of publication, and PubMed identification or doi number and allele frequencies are provided in Supplementary Table S1 online.

Reports were organized by worldwide race/ethnic designations according to the Human Genome Diversity Project–Centre d’Etude du Polymorphisme Humain15,16 (Africa, Americas, East Asia, Europe, Middle East, Oceania, and South Central Asia), with the exception that African Americans were listed separately from Africans. Additional population information (e.g., Caucasian, Chinese, Spanish, geographic region, and city of enrollment) and study subjects’ health status (i.e., whether they were healthy or patients) were also captured. A report on an Ashkenazi Jewish population is shown separately. Within major ethnicities, reports were sorted by “population” and year of publication.

Nomenclature and terms

Star (*) allele designations were assigned according to the Human Cytochrome P450 (CYP) Allele Nomenclature Database at http://www.cypalleles.ki.se/5 and alleles were tabulated as designated in each report; for sequencing studies, alleles not listed as being present were assigned a frequency of 0%. Allele frequencies were determined by counts. For each major ethnicity, the average frequency and range (minimum and maximum) were determined for each allele (Supplementary Table S1 online, yellow shading). The frequency of the CYP2D6*1 reference allele was calculated as 100 minus the sum of variants for a subset of reports that describe allele frequencies for the most common alleles found in a particular ethnic group (Supplementary Table S1 online, blue shading). Studies not describing *2, or describing only a few allelic variants (e.g., *4 and *5 only or *10 only), were omitted to maximize accuracy of CYP2D6*1 estimation.
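The reference-allele calculation described above is a simple subtraction, which also makes clear why limited genotyping panels inflate the CYP2D6*1 estimate. The following is a minimal Python sketch; the function name and all frequencies are hypothetical and are not taken from Supplementary Table S1.

```python
def reference_allele_frequency(variant_freqs):
    """Estimate the CYP2D6*1 frequency (%) as 100 minus the summed variant frequencies (%)."""
    total = sum(variant_freqs.values())
    if total > 100:
        raise ValueError("variant frequencies sum to more than 100%")
    return 100 - total

# Hypothetical study that genotyped a fairly broad panel (values in %).
broad_panel = {"*2": 25.0, "*4": 18.0, "*5": 3.0, "*10": 2.5, "*41": 8.5}
# The same hypothetical population, but only *4 and *5 were tested.
narrow_panel = {"*4": 18.0, "*5": 3.0}

star1_broad = reference_allele_frequency(broad_panel)
star1_narrow = reference_allele_frequency(narrow_panel)
print(star1_broad, star1_narrow)
```

With the narrow panel, every untested allele is counted as CYP2D6*1 by default, so the estimate is much higher than with the broad panel.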

To describe functionality of alleles and phenotype status, we adhered to the terms “no,” “decreased,” “normal,” and “increased” function alleles and poor (PM), intermediate (IM), normal (NM), and ultrarapid (UM) metabolizers, respectively. These terms were determined by CPIC in an effort to standardize terms for reporting of clinical pharmacogenetic test results (https://cpicpgx.org/resources/term-standardization/).

Translation of genotype into phenotype

Because there is sparse or no information for many allelic variants in terms of frequency and/or functionality, a subset of the most common alleles was selected for phenotype prediction (Supplementary Table S2 online). This selection was complemented by a small number of less common or rare alleles for which multiple reports were available (Africa, *45 or *46; East Asia, *44 and *49; multiple, *36; Africa, *40, *42; multiple, *43; Africa and Europe, *56). Alleles were grouped according to their perceived functionality; the values in parentheses indicate respective values assigned to an allele to calculate the AS14,17 of a diplotype: no-function (0) alleles (*3, *4, *4xN, *5, *6, *7, *8, *11, *12, *36, *40, *42, *56), decreased-function (0.5) alleles (*9, *10, *17, *29, *41, *44, *49), normal-function (1) alleles (*2, *35, *43, *45), and increased-function (2) alleles (*1xN, *2xN).
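The allele grouping above can be expressed as a lookup table from which the AS of a diplotype is obtained by adding the two allele values. The sketch below is illustrative only: the names `ALLELE_GROUPS`, `ACTIVITY_VALUE`, and `activity_score` are hypothetical, and CYP2D6*1 is added to the normal-function group on the assumption that the reference allele carries a value of 1.

```python
# Allele values as listed in the text; *1 is added here as an assumption
# (reference allele treated as normal function, value 1).
ALLELE_GROUPS = {
    0.0: ["*3", "*4", "*4xN", "*5", "*6", "*7", "*8", "*11", "*12",
          "*36", "*40", "*42", "*56"],
    0.5: ["*9", "*10", "*17", "*29", "*41", "*44", "*49"],
    1.0: ["*1", "*2", "*35", "*43", "*45"],
    2.0: ["*1xN", "*2xN"],
}

# Flatten into allele -> value.
ACTIVITY_VALUE = {
    allele: value
    for value, alleles in ALLELE_GROUPS.items()
    for allele in alleles
}

def activity_score(diplotype: str) -> float:
    """Return the activity score of a diplotype written as 'alleleA/alleleB'."""
    first, second = diplotype.split("/")
    return ACTIVITY_VALUE[first] + ACTIVITY_VALUE[second]

for example in ["*4/*5", "*4/*10", "*1/*41", "*1/*2xN"]:
    print(example, activity_score(example))
```

An allele that is absent from the table raises a `KeyError`, which seems preferable to silently defaulting it to a normal-function value.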

Studies for which CYP2D6*1 was calculated (i.e., those reporting CYP2D6*2, *41, and other commonly tested alleles) are shaded in gray in Supplementary Table S2 online (note that CYP2D6*1 calculations in Supplementary Table S1 online included studies (shaded in blue) that did not distinguish between CYP2D6*2 and *41). Next, sums of no-function (0), decreased-function (0.5), normal-function (1), and increased-function (2) alleles were calculated (Supplementary Table S2 online, columns AL-AR), unless there were sparse or no data (e.g., for the report by Gaedigk et al., row 8, no sums are shown because this article reports only on CYP2D6*42). For normal-function alleles, the frequency of CYP2D6*1 was calculated as 100 minus the sum of all other alleles (column AQ) and added to the sum of frequencies of the other normal-function alleles (column AN) for a total of normal-function alleles (column AR). Allele frequencies in Figure 1 were generated from data presented in columns AL to AR of Supplementary Table S2 online.

Figure 1

Comparison of selected allele frequencies across world populations. Allele frequencies differ considerably between populations, as demonstrated on selected allelic variants. Frequencies were calculated from all studies (Supplementary Table S2 online; CYP2D6*2xN (column L), CYP2D6*4 (column Y); CYP2D6*5 (column AA); CYP2D6*10 (column R), and CYP2D6*17 (column S)).

Next, frequencies of diplotypes giving rise to AS were calculated using the Hardy-Weinberg equation $1 = p^2 + q^2 + r^2 + s^2 + 2pq + 2pr + 2ps + 2qr + 2qs + 2rs$, where p, q, r, and s denote the frequencies of the four functional allele groups, as shown in Supplementary Table S3 online. Frequencies predicted for each AS are summarized in Supplementary Table S2 online columns AT to BC. Columns BF to BL and BN to BW in Supplementary Table S2 online (shaded in gray) contain only data for the studies reporting minimum genotype data (*2 and a minimum of no-function (*3, *4, *5) and decreased-function (*10, *41) alleles). AS frequencies for phenotype estimates were obtained from this subset of data.
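The Hardy-Weinberg expansion for four allele groups can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the worksheet in Supplementary Table S3: the function name, the group labels, and the example frequencies are hypothetical, and the four group frequencies are assumed to be proportions that sum to 1.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

# Activity value assigned to each functional allele group.
GROUP_VALUE = {"no": 0.0, "decreased": 0.5, "normal": 1.0, "increased": 2.0}

def activity_score_frequencies(group_freqs):
    """Expand (p + q + r + s)^2 and sum diplotype frequencies by activity score."""
    if abs(sum(group_freqs.values()) - 1.0) > 1e-6:
        raise ValueError("group frequencies must sum to 1")
    as_freqs = defaultdict(float)
    for g1, g2 in combinations_with_replacement(list(GROUP_VALUE), 2):
        # Homozygous terms (p^2) count once, heterozygous terms (2pq) twice.
        weight = 1 if g1 == g2 else 2
        score = GROUP_VALUE[g1] + GROUP_VALUE[g2]
        as_freqs[score] += weight * group_freqs[g1] * group_freqs[g2]
    return dict(as_freqs)

# Hypothetical group frequencies for one study.
example = {"no": 0.20, "decreased": 0.10, "normal": 0.65, "increased": 0.05}
as_freqs = activity_score_frequencies(example)
for score in sorted(as_freqs):
    print(f"AS = {score}: {as_freqs[score]:.4f}")
```

The ten diplotype terms collapse into eight distinct AS values because two pairs of diplotypes share a score: decreased/decreased and no/normal both give AS = 1, and normal/normal and no/increased both give AS = 2.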

The average, standard deviation (SD), median, and range (minimum and maximum) were determined for each allele (Supplementary Table S2 online, shaded in yellow).


Results

Literature review and allele-frequency tabulation

The most recent CYP2D6 allele-frequency table, published by CPIC13 (PharmGKB Knowledge Base at https://www.pharmgkb.org/guideline/PA166127636 and linked to the Human Cytochrome P450 (CYP) Allele Nomenclature Database at http://www.cypalleles.ki.se/), was updated to include reports published by 7 December 2015. Allele frequencies from a total of 177 reports were tabulated, representing approximately 60,000 unrelated subjects. Several studies yielded two or more entries (e.g., multiple ethnic groups and populations from different geographical regions), for a total number of 263 entries. Some study populations or subsets of subjects may have been reported more than once, especially when additional genotyping was performed in subsequent studies (Supplementary Table S1 online). In a few instances, frequencies in Supplementary Table S1 online differ from those published, as a result of errors identified during the curation and review process. In some instances, authors were also contacted for clarification or additional information to ensure that data were captured accurately.

Although CYP2D6 allelic variation has been extensively studied in many populations, 42% of subjects are of European origin, followed by subjects who were East Asian (24%), from the Americas (14%), African Americans (7%), from South Central Asia (6%), from Africa (3%), from the Middle East (2%), and from Oceania (1%) (calculated from Supplementary Table S2 online, column H).

Allele frequencies

The extent of genotype data varied considerably between studies (Supplementary Table S1 online). Many contained information for a single allelic variant or a small number of allelic variants, whereas others were considerably more comprehensive or utilized gene sequencing. To facilitate data summary, display, and calculations, the most commonly genotyped allelic variants and a selection of rare alleles often found only in some populations were extracted from Supplementary Tables S1 and S2 online. Allele frequencies are summarized for each study and by major ethnicity, and differences in allele frequencies across populations are exemplified in Figure 1. For instance, the CYP2D6*10 decreased-function allele is lowest in Oceanians and Europeans and highest in East Asians, averaging 1.6, 2.6, and 45%, respectively. By contrast, the no-function CYP2D6*4 allele is highest in Ashkenazi Jewish and European subjects and lowest in East Asian subjects, averaging 22, 18, and 0.6%, respectively.

As shown in Figure 2, allele frequencies across populations differed substantially when alleles were grouped as no-, decreased-, normal-, and increased-function alleles. However, frequencies also varied depending on whether they were determined using all reports (Supplementary Figure S1 online) or only those that genotyped for a minimum of allelic variants (i.e., CYP2D6*2 and a minimum of no-function (*3, *4, *5) and decreased-function (*10, *41) alleles). The latter differences are caused mainly by (i) overestimation of CYP2D6*1 and/or *2 owing to limited genotyping that leads to CYP2D6*1 and/or *2 default assignments (depending on which SNPs/alleles were or were not genotyped) and (ii) lack of data for no-, decreased-, and increased-function alleles that have not been tested. To predict phenotype as accurately as possible, only reports fulfilling the aforementioned minimum genotype requirements were utilized for phenotype predictions. Of the 263 entries (Supplementary Table S2 online), 76—representing almost 20,000 subjects—fulfilled these requirements. In this subset, 36% of subjects were of European origin, followed by East Asians (29%), those from the Americas (19%), African Americans (6%), Africans (3%), South Central Asians (2%), Middle Easterners (3%), and those from Oceania (1%) (calculated from Supplementary Table S2 online, column H).

Figure 2

CYP2D6 allele frequencies across world populations. The graph depicts frequencies of no (no)-, decreased (↓)-, normal (↔)-, and increased (↑)-function alleles that were calculated from studies reporting a minimum of allelic variants (Supplementary Table S2 online columns BF–BL). Allele frequencies represent the average of an allele in respective populations and therefore do not add up to 100%. An accompanying graph generated from all studies listed in Supplementary Table S2 online (columns AL–AR) is provided as Supplementary Figure S1 online.

Translation of genotype into phenotype

The Hardy-Weinberg equation (Supplementary Table S3 online) was used to determine the frequencies of diplotypes from the frequencies of alleles grouped by their functionality (Table 1) and an activity score (AS) assigned to each diplotype. Figure 3a depicts the frequencies for all combinations of no-, decreased-, normal-, and increased-function alleles. The allele combinations gave rise to eight AS groups (AS = 0, AS = 0.5, AS = 1, AS = 1.5, AS = 2, AS = 2.5, AS = 3, and AS = 4). Subjects with AS = 0, AS = 0.5, and AS >2 were designated as genetic poor, intermediate, and ultrarapid metabolizers (gPM, gIM, and gUM), respectively. Subjects with AS = 1, AS = 1.5, and AS = 2 were designated as genetic normal metabolizers (gNM). (To specify that the phenotype was predicted from the genotype, we utilize the prefix “g,” as previously suggested.17) Because these diplotype groups cover a wide range of activity, subjects with AS = 1 are distinguished here as gNM-slow (gNM-S), and subjects with AS = 1.5 and AS = 2 are designated as gNM-fast (gNM-F), in accordance with the report by Gaedigk et al.14 Figure 3a,b shows the frequencies of predicted phenotypes for AS groupings before and after consolidating diplotype groups. Finally, Figure 3c depicts the frequencies after consolidating AS = 1, 1.5, and 2 into a single group of normal metabolizers (gNM).
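The translation of AS into predicted phenotype described above can be written as a small mapping with an option to keep or merge the gNM-S and gNM-F subgroups. The sketch below is illustrative; the function names and the AS frequencies are hypothetical and are not values from Table 1.

```python
def predicted_phenotype(score, split_normal=True):
    """Map an activity score to a genotype-predicted phenotype label."""
    if score == 0:
        return "gPM"
    if score == 0.5:
        return "gIM"
    if score in (1, 1.5, 2):
        if not split_normal:
            return "gNM"
        return "gNM-S" if score == 1 else "gNM-F"
    if score > 2:
        return "gUM"
    raise ValueError(f"unexpected activity score: {score}")

def phenotype_frequencies(as_freqs, split_normal=True):
    """Sum activity-score frequencies into phenotype groups."""
    totals = {}
    for score, freq in as_freqs.items():
        label = predicted_phenotype(score, split_normal)
        totals[label] = totals.get(label, 0.0) + freq
    return totals

# Hypothetical frequencies for the eight AS groups.
as_freqs = {0: 0.04, 0.5: 0.04, 1: 0.27, 1.5: 0.13,
            2: 0.4425, 2.5: 0.01, 3: 0.065, 4: 0.0025}

detailed = phenotype_frequencies(as_freqs, split_normal=True)
consolidated = phenotype_frequencies(as_freqs, split_normal=False)
print(detailed)
print(consolidated)
```

Keeping the grouping rule in one function makes it explicit which classification was used, which matters because AS = 1 diplotypes are classified inconsistently in the literature.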

Table 1: Phenotype predictions from genotypes
Figure 3

CYP2D6 phenotype prediction from genotype data. An activity score (AS) was assigned to each genotype (no, ↓, ↔, and ↑ indicate genotypes with no-, decreased-, normal-, and/or increased-function allele combinations). Panel a shows average frequencies for the different allele combinations and their respective phenotype classifications into poor (gPM), intermediate (gIM), normal-slow (gNM-S), normal-fast (gNM-F), and ultrarapid (gUM) metabolizer groups. The prefix “g” indicates that the phenotype is predicted from genotype. For panel b, genotypes giving rise to AS = 1 or AS = 2 were grouped as indicated. Panel c depicts the translation of genotype or AS into phenotype according to the classification used in CPIC guidelines. Note that genotypes falling into the AS = 1 group are inconsistently classified as gIM or gNM throughout the literature.

Diplotype frequencies predicting poor metabolism were highest in Europeans (average, 5.4%) and in the Ashkenazi Jewish population (6%) and lowest in East Asians (average, 0.4%), South Central Asians (average, 1%), Oceanians (average, 0.4%), and subjects from the Middle East (average, 0.9%) (Table 1, Supplementary Table S2 online and Figure 3). On the other end of the activity extreme, diplotypes predicting ultrarapid metabolism were highest in Oceanian (21.2%), Ashkenazi Jewish (11.5%), and Middle Eastern (11.2%) populations and lowest in subjects from East Asia (1.4%). Considerable differences were also observed for intermediate metabolizers (subjects with AS = 0.5), ranging from 2.8% (from the Americas) to 11% (Africans and African Americans). As discussed, there is no standard approach to grouping, especially for AS groups 0.5 and 1.

A small subset of studies performed in vivo phenotyping in addition to genotyping. The probe drug used to measure phenotype, number of subjects phenotyped, and frequencies of PMs observed are indicated in Supplementary Table S2 online (columns BZ to CB) for each study. Table 2 summarizes observed versus predicted frequencies for PMs. None of the reports for the subjects from the Middle East, Oceania, or South Central Asia had phenotype data. Phenotypic PMs are referred to here as pPMs to distinguish them from genotype-predicted PMs (gPMs). Papers reporting on phenotyping without accompanying genotype data were not included in this investigation. With the exception of Africans, the frequency of pPMs was higher than that predicted by genotype.

Table 2: Comparison of predicted and observed poor metabolizers


Discussion

The materials provided by the CPIC guidelines,11,12,13 specifically the CYP2D6 allele-frequency table (Supplementary Table S2 online), are a valuable resource (look-up table) regarding genetic variation for a particular population or across major ethnic groups. Because frequencies are tabulated for each study in a standardized format, information regarding which variants have been reported by each study and which variants were not tested for is easy to retrieve. CPIC guidelines provide ample information regarding allele frequencies and how to assign an activity score and translate diplotypes into predicted phenotypes. In this report, we aimed to provide a comprehensive and detailed summary of phenotypes predicted from genotype data. Although we are providing results for major populations, the compiled materials can be utilized for further, more detailed population-specific/geographic-specific analyses by the reader.

Phenotype prediction from genotype data depends heavily on the allelic variants present in a population of interest and the alleles interrogated. Unless a study performs extensive genotyping or uses a gene resequencing approach, many less frequent, rare, and/or novel SNPs will elude detection and the frequencies of default alleles will be overestimated. For example, the CYP2D6*40 no-function allele defaults to CYP2D6*17, a decreased-function allele, or normal-function CYP2D6*1 or *2 designations if the key CYP2D6*2, *17, and/or *40 SNPs are not part of the test panel. Because CYP2D6*1 and *2 are the most common default assignments, their reported frequencies vary widely between populations of similar ethnic backgrounds and within the major ethnic groups (Supplementary Tables S1 and S2 online). For instance, in Europeans, CYP2D6*1 and *2 frequencies range from 29.2 to 100% (average, 63.1%; SD ±23.2) and from 10.5 to 40.6% (average, 27.8%; SD ±8.6), respectively. To optimize the accuracy of phenotype prediction, we included in our calculations only studies that tested for CYP2D6*2 and a minimum of no-function (*3, *4, *5) and decreased-function (*10, *41) alleles. Studies not meeting these criteria are likely to miss a considerable number of no- and decreased-function alleles and hence overestimate the normal-function CYP2D6*1 and *2 alleles. Only approximately one-third of the total number of studies and subjects tabulated fulfilled the minimal genotyping requirement. Some of the exclusions can be explained by the time when a study was performed (before certain alleles were known) or by a focus on the determination of a single allele of interest (e.g., one that had just been discovered). The impact of the amount of genotyping performed is evident in Figure 2. Similar or higher frequencies of no- and decreased-function alleles and lower frequencies of the CYP2D6*1 and *2 default alleles are consistently detected for the subset of studies reporting data for more inclusive allele panels. For African Americans, for example, the frequency of no- and decreased-function alleles increases from 9.7 and 26.3% to 15.4 and 36.3%, respectively, whereas the frequency of CYP2D6*1 decreases from 75.9 to 46.6%. Although these differences are not as dramatic for all major populations, similar trends are observed (Figure 3 and Supplementary Table S2 online). Furthermore, our calculations do not include no- and decreased-function alleles for which there is sparse information, that are population-specific (e.g., CYP2D6*14, *15, *18 or *59, Supplementary Table S1 online), or that are technically difficult to genotype, such as CYP2D6*13-like CYP2D7-2D6 hybrid arrangements,18,19 and are therefore typically not included in genotyping panels. There are also numerous allelic variants of unknown or uncertain in vivo function, which may contribute to poor and/or intermediate metabolism. Hence, phenotype predictions from genotype data on a population basis underestimate the prevalence of PM and IM subjects.

Additional challenges are grouping alleles into functional categories and the resulting diplotypes into AS and phenotype categories. As discussed by Hicks et al.,17 alleles are assigned a value for classification (no, decreased, normal, and increased function) to determine the AS of a diplotype,6,14 but this rather crude system does not take into account the degree of decreased or increased function or substrate-specific activity of an allele. We classified CYP2D6*2 as a functional allele, which is consistent with its classification in CPIC guidelines11,12,13 and our findings with atomoxetine as described here. However, this or other alleles may need to be assigned a different value to more accurately reflect their activity toward a particular drug,20 which may lead to different phenotype classifications of some diplotypes and shift population profiles. In our recent study investigating the norepinephrine reuptake inhibitor atomoxetine, subjects with an AS of 0.5 presented with a total atomoxetine systemic exposure that was significantly different from that observed for subjects with an AS of 1 or 2 (ref. 21). Differences in the pharmacokinetics of atomoxetine have also been observed between groups with an AS of 1, 1.5, and 2 in East Asian subjects.22,23 Based on these findings, classification of subjects with AS of 0.5 as IMs and those with AS of 1, 1.5, or 2 as NMs can be justified for atomoxetine. For other substrates, however, allele values may need to be revised to reflect their activity toward the substrate of interest. The provided worksheets enable the reader to easily reclassify an allele of interest from, for example, functional to decreased function, and recalculate phenotype frequencies.
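The effect of reclassifying a single allele can be illustrated outside the worksheets. The following Python sketch recomputes the AS distribution after changing the value of CYP2D6*2 from 1 to 0.5; the function name and the three-allele frequencies are hypothetical, and the reclassification is shown only as an example of the mechanics, not as a recommendation.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

def as_distribution(allele_freqs, allele_values):
    """Hardy-Weinberg diplotype frequencies summed by activity score."""
    dist = defaultdict(float)
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        weight = 1 if a == b else 2  # heterozygous diplotypes occur twice
        score = allele_values[a] + allele_values[b]
        dist[score] += weight * allele_freqs[a] * allele_freqs[b]
    return dict(dist)

# Hypothetical population with three alleles (proportions sum to 1).
freqs = {"*1": 0.5, "*2": 0.3, "*4": 0.2}
values = {"*1": 1.0, "*2": 1.0, "*4": 0.0}

baseline = as_distribution(freqs, values)
# Reclassify *2 from normal function (1) to decreased function (0.5).
revised = as_distribution(freqs, {**values, "*2": 0.5})

print("gIM (AS = 0.5) before:", baseline.get(0.5, 0.0))
print("gIM (AS = 0.5) after: ", revised.get(0.5, 0.0))
```

In this toy population the gPM frequency is unchanged, whereas the *2/*4 diplotype moves from AS = 1 to AS = 0.5, so a gIM group appears where none was predicted before.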

In general, there is a consensus on defining subjects with two no-function alleles (AS = 0) as gPMs and those with one decreased-function (AS = 0.5) or two (AS = 2) or three or more functional gene copies (AS ≥ 3) as gIMs, gNMs, and gUMs, respectively. However, some investigators define diplotypes with an AS of 0 or 0.5 as poor or slow metabolizers. Subjects with AS of 1 and 1.5 diplotypes are the most controversial and are grouped as either gIMs or gNMs. Likewise, diplotypes with an AS of 2.5 may be grouped as gNMs or gUMs. Although arguments can be made for and against groupings, especially when pharmacokinetic data are lacking or sparse, in the absence of standardization it can be difficult (if not impossible) to interpret results, compare reports, or draw definitive conclusions.24 This is further exacerbated if reports do not clearly describe grouping procedures and/or variants genotyped. As illustrated in Figure 3, the notable differences in frequencies of IMs and NMs, depending on grouping, underscore the importance of full disclosure of how genotype is translated into phenotype and probably also explain discordant findings in the literature comparing genotype with phenotype associations.

LLerena et al.8 analyzed interethnic variability of CYP2D6 alleles and compared predicted and measured phenotypes across world populations. Data were extracted from 172 articles that encompassed 44,572 individuals. The authors reported allele frequencies for CYP2D6*4, *10, *17, *29, and *41 that are similar to the frequencies we have determined for African Americans, East Asians, and Europeans. Those for Middle Easterners differ, which is probably due to the inclusion of additional data into our analysis. For other ethnic groups, comparisons of findings are difficult to make, mostly because of differences in grouping. Regarding phenotype, LLerena et al. focused on comparing predicted and measured phenotypes for PMs and UMs. To that end, these authors included phenotype studies that did not include genotype data. With few exceptions, the frequency of phenotypic PMs (referred to by LLerena et al. as mPM, “metabolic” phenotype and corresponding to the term “phenotypic” PMs (pPMs) used in this report) was higher than that predicted by genotype, which is in line with our findings (Table 2). For example, LLerena et al. found predicted (gPM) versus observed (pPM) frequencies of 5.75 vs. 7.08%, 6.32 vs. 8.13%, and 0.26 vs. 0.84% for Caucasian Americans, Central Europeans, and East Asians, respectively. Corresponding observations from our study were 5.44 vs. 8.45% and 0.41 vs. 0.84% for Europeans and East Asians, respectively.

Considering that genotyping does not detect all no-function alleles, and that our analysis did not take into account rare alleles if data were available in only a single report or a few reports, it is not surprising that the frequency of predicted PMs is lower than that of observed PMs. Similarly, the frequency of gIMs (i.e., subjects with AS = 0.5 diplotypes consisting of one no- and one decreased-function allele) is probably underestimated and the frequency of gUMs may be overestimated if the presence of a duplication event is defaulted to a CYP2D6*2xN (i.e., a duplication or multiplication of a normal-function allele).6,18,25 Although undetected and unknown allelic variants account for a portion of the difference between predicted and observed phenotypes, other genetic factors such as a recently described distant enhancer SNP26,27 and variations in the gene of the nuclear factor 4 alpha transcription factor28,29 have also been shown to impact CYP2D6 gene expression, and thus activity, and are currently not factored into phenotype prediction algorithms. Genetic variation in cytochrome P450 oxidoreductase (POR) could also contribute to variability in observed CYP2D6 activity.30 Furthermore, inhibition of CYP2D6 enzymatic activity by concomitant medication(s) can dramatically change a person’s metabolic capacity, resulting in a phenotype that does not match the genotype-predicted phenotype.20,31,32 There is also growing evidence that transient or chronic conditions such as pregnancy and inflammation can alter the activity of drug-metabolizing enzymes, including CYP2D6, which may lead to phenocopying.33,34,35,36 The implication of phenocopying for genotype–phenotype association studies has been reviewed by Shah et al.,37 and the impact of physiological, pathological, and environmental factors on CYP2D6 expression and activity and their implications for precision medicine have been reviewed by He et al.3

The accuracy of actionable CYP2D6 phenotype prediction by genetic analysis could be further improved by extending allele coverage of test panels or utilizing sequence-based approaches such as those recently described38,39 and by complementing such efforts with gene copy-number-variation analysis capable of determining gene copy number, allelic variation of the copied/multiplied gene copies, and rearranged gene structures. However, even the most advanced genetic test strategy will have its limitations unless other factors are integrated, including genetic variation in other loci contributing to the transport, metabolism, and disposition of a substrate/drug of interest, allele/drug-specific metabolism, and host factors such as comedication and underlying physiological and pathological conditions.

In summary, we present CYP2D6 phenotype predictions based on the most comprehensive, literature-based, allele-frequency compilation available. Data are presented for each included report, making the provided materials a highly valuable resource for allele-frequency distributions and phenotype predictions across and between populations. Interpretation of the functional consequences of predicted phenotypes for drug clearance in vivo, and thus translation into specific dosing guidelines for individual drug–diplotype pairs, will benefit from future genotype-stratified pharmacokinetic studies for high-priority drugs.


The authors declare no conflict of interest.


  1. Polymorphism of human cytochrome P450 2D6 and its clinical significance: part II. Clin Pharmacokinet 2009;48:761–804.
  2. Polymorphism of human cytochrome P450 2D6 and its clinical significance: part I. Clin Pharmacokinet 2009;48:689–723.
  3. Impact of physiological, pathological and environmental factors on the expression and activity of human cytochrome P450 2D6 and implications in precision medicine. Drug Metab Rev 2015:1–50.
  4. The human debrisoquine 4-hydroxylase (CYP2D) locus: sequence and identification of the polymorphic CYP2D6 gene, a related gene, and a pseudogene. Am J Hum Genet 1989;45:889–904.
  5. Update on allele nomenclature for human cytochromes P450 and the Human Cytochrome P450 Allele (CYP-allele) Nomenclature Database. Methods Mol Biol 2013;987:251–259.
  6. Complexities of CYP2D6 gene analysis and interpretation. Int Rev Psychiatry 2013;25:534–553.
  7. CYP2D6 worldwide genetic variation shows high frequency of altered activity variants and no continental structure. Pharmacogenet Genomics 2007;17:93–101.
  8. Interethnic variability of CYP2D6 alleles and of predicted and measured metabolic phenotypes across world populations. Expert Opin Drug Metab Toxicol 2014;10:1569–1583.
  9. CPIC: Clinical Pharmacogenetics Implementation Consortium of the pharmacogenomics research network. Clin Pharmacol Ther 2011;89:464–467.
  10. Clinical Pharmacogenetics Implementation Consortium. Clinical Pharmacogenetics Implementation Consortium (CPIC) guidelines for codeine therapy in the context of cytochrome P450 2D6 (CYP2D6) genotype. Clin Pharmacol Ther 2012;91:321–326.
  11. Clinical Pharmacogenetics Implementation Consortium. Clinical Pharmacogenetics Implementation Consortium guidelines for cytochrome P450 2D6 genotype and codeine therapy: 2014 update. Clin Pharmacol Ther 2014;95:376–382.
  12. Clinical Pharmacogenetics Implementation Consortium. Clinical Pharmacogenetics Implementation Consortium guideline for CYP2D6 and CYP2C19 genotypes and dosing of tricyclic antidepressants. Clin Pharmacol Ther 2013;93:402–408.
  13. Clinical Pharmacogenetics Implementation Consortium. Clinical Pharmacogenetics Implementation Consortium (CPIC) guideline for CYP2D6 and CYP2C19 genotypes and dosing of selective serotonin reuptake inhibitors. Clin Pharmacol Ther 2015;98:127–134.
  14. The CYP2D6 activity score: translating genotype information into a qualitative measure of phenotype. Clin Pharmacol Ther 2008;83:234–242.
  15. Clines, clusters, and the effect of study design on the inference of human population structure. PLoS Genet 2005;1:e70.
  16. Genetic structure of human populations. Science 2002;298:2381–2385.
  17. Challenges in CYP2D6 phenotype assignment from genotype data: a critical assessment and call for standardization. Curr Drug Metab 2014;15:218–232.
  18. CYP2D7-2D6 hybrid tandems: identification of novel CYP2D6 duplication arrangements and implications for phenotype prediction. Pharmacogenomics 2010;11:43–53.
  19. Identification of novel CYP2D7-2D6 hybrids: non-functional and functional variants. Front Pharmacol 2010;1:121.
  20. In vivo assessment of the metabolic activity of CYP2D6 diplotypes and alleles. Br J Clin Pharmacol 2015;80:1122–1130.
  21. Single dose, CYP2D6 genotype-stratified pharmacokinetic study of atomoxetine in children with ADHD. Clin Pharmacol Ther 2016;44:1070–1079.
  22. Effects of the CYP2D6*10 allele on the pharmacokinetics of atomoxetine and its metabolites. Arch Pharm Res 2015;38:2083–2091.
  23. Pharmacokinetics, safety, and tolerability of atomoxetine and effect of CYP2D6*10/*10 genotype in healthy Japanese men. J Clin Pharmacol 2012;52:388–403.
  24. Pharmacogenetic allele nomenclature: International workgroup recommendations for test result reporting. Clin Pharmacol Ther 2016;99:172–185.
  25. Gene copy number variations: it is important to determine which allele is affected. Pharmacogenomics 2011;12:299–301.
  26. Functional characterization of CYP2D6 enhancer polymorphisms. Hum Mol Genet 2015;24:1556–1562.
  27. Common CYP2D6 polymorphisms affecting alternative splicing and transcription: long-range haplotypes with two regulatory variants modulate CYP2D6 activity. Hum Mol Genet 2014;23:268–278.
  28. Genetic polymorphism of hepatocyte nuclear factor-4alpha influences human cytochrome P450 2D6 activity. Hepatology 2008;48:635–645.
  29. In silico and in vitro identification of microRNAs that regulate hepatic nuclear factor 4α expression. Drug Metab Dispos 2012;40:726–733.
  30. Effects of genetic variants of human P450 oxidoreductase on catalysis by CYP2D6 in vitro. Pharmacogenet Genomics 2010;20:677–686.
  31. Composite functional genetic and comedication CYP2D6 activity score in predicting tamoxifen drug exposure among breast cancer patients. J Clin Pharmacol 2010;50:450–458.
  32. Genotype and co-medication dependent CYP2D6 metabolic activity: effects on serum concentrations of aripiprazole, haloperidol, risperidone, paliperidone and zuclopenthixol. Eur J Clin Pharmacol 2016;72:175–184.
  33. Drug metabolism and transport during pregnancy: how does drug disposition change during pregnancy and what are the mechanisms that cause such changes? Drug Metab Dispos 2013;41:256–262.
  34. Coordinating role of RXRα in downregulating hepatic detoxification during inflammation revealed by fuzzy-logic modeling. PLoS Comput Biol 2016;12:e1004431.
  35. Inflammation-induced phenoconversion of polymorphic drug metabolizing enzymes: hypothesis with implications for personalized medicine. Drug Metab Dispos 2015;43:400–410.
  36. National Institute for Child Health and Human Development Network of Maternal-Fetal-Medicine Units. Temporal changes in drug metabolism (CYP1A2, CYP2D6 and CYP3A activity) during pregnancy. Am J Obstet Gynecol 2005;192:633–639.
  37. CYP450 genotype and pharmacogenetic association studies: a critical appraisal. Pharmacogenomics 2016;17:259–275.
  38. Long-read single-molecule real-time (SMRT) full gene sequencing of cytochrome P450-2D6 (CYP2D6). Hum Mutat 2016;37:315–323.
  39. Constellation: a tool for rapid, automated phenotype assignment of a highly polymorphic pharmacogene, CYP2D6, from whole-genome sequence. npj Genomic Med 2016. doi:10.1038/npjgenmed.2015.7 http://www.nature.com/articles/npjgenmed20157.



T.E.K., K.S., and M.W.C. acknowledge financial support from R24 GM61374 and R24 GM115264.

Author information


  1. Division of Clinical Pharmacology, Toxicology & Therapeutic Innovation, Children’s Mercy–Kansas City, Kansas City, Missouri, USA

    • Andrea Gaedigk & J. Steven Leeder
  2. School of Medicine, University of Missouri–Kansas City, Kansas City, Missouri, USA

    • Andrea Gaedigk & J. Steven Leeder
  3. Department of Genetics, Stanford University, Stanford, California, USA

    • Katrin Sangkuhl, Michelle Whirl-Carrillo & Teri Klein


Corresponding author

Correspondence to Andrea Gaedigk.
