Introduction

An increasing number of clinically relevant associations between drug response and genomic variation have been reported over the past years, resulting in evidence-based pharmacogenetic guidelines [1, 2]. For instance, the Pharmacogenomics Knowledgebase PharmGKB (https://www.pharmgkb.org) has collected and curated information for more than 740 drugs and, to date, contains 189 clinical guidelines and 868 drug label annotations approved by various pharmaceutical regulatory organizations such as the US Food and Drug Administration (FDA) or the European Medicines Agency (EMA). Nevertheless, although many patients would benefit from a pharmacogenetics-based prescription policy [3], only limited applications are observed in clinical practice, especially in primary care [4,5,6,7]. Indeed, genetic testing is in most cases performed retrospectively, when adverse side effects arise or when a drug lacks efficacy. The main barriers to the implementation of pharmacogenetics in routine clinical practice are the lack of awareness and education among physicians and pharmacists, the lack of solid scientific evidence for pharmacogenomic biomarkers, the lack of harmonized and implementable pharmacogenomic guidelines and, in some instances, the absence of a dedicated infrastructure to integrate pharmacogenetic testing into the workflow of health care providers [6, 8]. Seminal studies have notably shown the importance of common genetic variants affecting phase I or phase II enzymes in the resistance to various pharmacological agents or the occurrence of life-threatening side effects [9]. Prominent examples include the association between common defective TPMT alleles and the risk of hematotoxicity following 6-mercaptopurine exposure [10] or the impact of frequent CYP2C19 polymorphisms on clopidogrel efficacy [11].
Nevertheless, these common genetic variants, while important, account for only a small part of the inherited individual variation in drug response, and a substantial fraction of the genetically encoded variability in drug pharmacokinetics remains to be elucidated. Interestingly, recent large-scale studies have revealed that more than 90% of the genetic variability in genes associated with drug metabolism and disposition is attributable to rare genetic variants, but the functional impact of such rare pharmacogenetic variants on drug response remains poorly documented.

Fluoropyrimidine-based treatment regimens are the standard therapy for many distinct types of advanced solid tumors, including breast, colorectal and head and neck cancers [12]. Nevertheless, up to 30% of patients will experience serious adverse drug reactions such as diarrhea, stomatitis, mucositis, myelosuppression or neurotoxicity, which can be lethal in 0.5–1% of cases [12, 13]. Dihydropyrimidine dehydrogenase (DPD), the initial and rate-limiting enzyme involved in the catabolism of 5-fluorouracil (5-FU), is responsible for the elimination of 80–85% of the administered dose. Plasma concentrations of uracil ([U]), the endogenous substrate of DPD, and of its product dihydrouracil ([UH2]) are routinely used as surrogate markers for systemic DPD activity [14]. Indeed, pretreatment [U] and the [UH2]/[U] ratio are highly correlated with systemic DPD activity, and many studies have shown a relationship between fluoropyrimidine-induced toxicity and a DPD phenotype characterized by high [U] or a low [UH2]/[U] ratio [14, 15]. However, the equipment and the recommended pre-analytical conditions required for the measurement of [U] and [UH2] are not available in many clinical laboratories [16, 17]. Therefore, alternative approaches such as DPYD-based pharmacogenetic assays are convenient complementary methods to predict DPD activity [16]. Indeed, according to PharmGKB, more than 20 loss-of-function DPYD variants have been reported to alter DPD enzymatic activity, and patients harboring such variants are consequently exposed to an increased risk of severe toxicity when receiving standard doses of fluoropyrimidine. For this reason, international guidelines now recommend pre-emptive DPYD genotyping for several clinically relevant defective variants, i.e., c.1905+1G>A (DPYD*2A), c.1679T>G (DPYD*13), c.2846A>T and Haplotype B3 (c.1236G>A or c.1129-5923C>G), and provide genotype-guided prescribing recommendations [17, 18].

In this study, using Next Generation Sequencing (NGS), we comprehensively assessed the relationship between DPYD genotype and DPD phenotype in a series of 2 972 patients and identified new rare clinically relevant variants associated with DPD deficiency. Our results also show that rare DPYD genetic variants account for a significant part of the interindividual variability of DPD activity. Therefore, comprehensive NGS-based genotyping, rather than candidate SNP interrogation, should be considered to guide personalized fluoropyrimidine therapy.

Materials and methods

Studied cohort

All patients included in this study were eligible for a uracil analog-based chemotherapy (Supplementary Table S1). Only those for whom both DPYD genotype and DPD phenotype were available were included. The protocol has been certified to be in accordance with French laws by the Institutional Review Board of Centre Hospitalier Universitaire de Lille (France). Genotyping analysis and DPD phenotyping were performed as described in our regular local protocol to identify DPD-deficient patients at increased risk of severe fluoropyrimidine-induced toxicity. However, information regarding fluoropyrimidine toxicity was not available. All patients provided their written informed consent for genetic analysis and for publication of this paper, in accordance with institutional guidelines and the Declarations of Helsinki and Istanbul. The DNA collection was registered by the Ministère de l’Enseignement Supérieur et de la Recherche (Paris, France) under the number DC-2008–642.

DPD phenotyping

Pretreatment plasma uracil ([U]) and dihydrouracil ([UH2]) were quantified using a Waters TQD UPLC®-MS/MS System (Waters Corp., Milford, MA, USA) equipped with an electrospray ionization interface according to the method described by Coudore et al. [19]. Data acquisition and processing were performed using MassLynx v.4.0 software. DPD activity was categorized as normal, partial deficiency or complete deficiency based on previous reports using the [UH2]/[U] ratio [20,21,22,23,24,25,26]. Although no consensus cut-off value for the [UH2]/[U] ratio has been established yet, a cut-off of [UH2]/[U] ≤ 10 was chosen for DPD deficiency, as it has previously been shown to be a good predictor of fluoropyrimidine toxicity [15, 27]. Therefore, partial DPD deficiency was defined as [UH2]/[U] ≤ 10 whereas complete DPD deficiency was defined as [UH2]/[U] ≤ 1. Alternatively, DPD activity can be estimated from [U] alone, with a cut-off of [U] ≥ 16 ng/mL defining partial deficiency and [U] > 150 ng/mL defining complete deficiency [15].
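The two phenotyping rules described above can be written down as a short sketch. This is only an illustration of the stated cut-offs, not the laboratory's actual software: the function and constant names are hypothetical, [U] is assumed to be expressed in the unit of the cut-offs quoted above, and the complete-deficiency rule is assumed to take precedence because a ratio ≤ 1 also satisfies ≤ 10.

```python
# Cut-offs quoted in the text; all names below are illustrative assumptions.
RATIO_DEFICIENCY_MAX = 10.0  # [UH2]/[U] <= 10: DPD deficiency
RATIO_COMPLETE_MAX = 1.0     # [UH2]/[U] <= 1: complete deficiency
U_PARTIAL_MIN = 16.0         # [U] >= 16: partial deficiency
U_COMPLETE_MIN = 150.0       # [U] > 150: complete deficiency


def classify_by_ratio(uh2: float, u: float) -> str:
    """Classify DPD status from plasma [UH2] and [U] given in the same unit."""
    if u <= 0:
        raise ValueError("[U] must be positive to compute the ratio")
    ratio = uh2 / u
    # Assumption: the stricter (complete) rule is checked first.
    if ratio <= RATIO_COMPLETE_MAX:
        return "complete"
    if ratio <= RATIO_DEFICIENCY_MAX:
        return "partial"
    return "normal"


def classify_by_uracil(u: float) -> str:
    """Classify DPD status from plasma [U] alone."""
    if u > U_COMPLETE_MIN:
        return "complete"
    if u >= U_PARTIAL_MIN:
        return "partial"
    return "normal"


# Hypothetical patient: the two rules need not agree.
print(classify_by_ratio(uh2=90.0, u=12.0), classify_by_uracil(12.0))
```

Keeping the two rules as separate functions makes it easy to cross-tabulate their outputs for the same patient.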

DPYD genotyping

All patients gave their written informed consent for genetic testing. Genomic DNA was extracted from peripheral blood using Chemagic Star (Chemagen, Baesweiler, Germany) and then quantified using the NanoDrop® spectrophotometer (ThermoFisher Scientific, Waltham, MA, USA) according to the manufacturer’s instructions. The genomic sequence of the DPYD gene was retrieved from the NCBI website, and the Reference Sequence NG_008807.2 was used throughout. Primers were designed to cover all exonic regions and at least 30 bp of each flanking intron using the Fluidigm D3™ web-based assay design tool. A total of 64 unique primer pairs were created (Supplementary Table S2) and optimized for the Fluidigm Access Array (Fluidigm, South San Francisco, CA, USA). Amplification of genomic DNA was performed in up to 10-plex PCR reaction wells, followed by the addition of barcode indexes and sequencing adaptors in a further PCR according to the manufacturer’s instructions. Pooled amplicons were harvested and diluted to prepare unidirectional libraries for 150 base-pair (bp) paired-end sequencing on the Illumina MiSeq sequencing platform (Illumina, San Diego, CA, USA). Illumina NGS reads were trimmed on base Phred quality (mean quality in a 30 bp sliding window > 20 and 3′ base quality ≥ 6) and aligned to the hg19 human genome reference sequence with the Burrows–Wheeler Aligner (v0.6.1-r112-master). Variant calling was performed using MiSeq Reporter v2.6, GATK v3.7 or GATK v4.1.4.0 (Genome Analysis Toolkit) [28] without downsampling or removal of PCR duplicates; variants with quality/depth < 5 or depth < 30 were filtered out. All very rare (MAF ≤ 0.1%) and novel variants identified by NGS analysis were validated by Sanger sequencing (Table S2).
The functional consequences of each variant were estimated by in silico analysis using bioinformatic prediction tools such as SIFT, PolyPhen-2 or CADD, and variants were classified according to the ACMG classification.
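The post-calling filters and the Sanger confirmation rule can be sketched as follows. This is an assumption-laden illustration, not the pipeline used in the study: "quality/depth" is interpreted here as the call's quality score divided by its read depth, novel variants are represented by a missing MAF, and the record type, field names and example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantCall:
    """Hypothetical minimal record for one called variant."""
    name: str
    qual: float           # variant quality score
    depth: int            # read depth at the variant position
    maf: Optional[float]  # population MAF as a fraction; None if novel


def passes_filters(v: VariantCall, min_qd: float = 5.0, min_depth: int = 30) -> bool:
    """Keep a call unless quality/depth < 5 or depth < 30 (thresholds from the text)."""
    if v.depth < min_depth:
        return False
    # Assumption: quality/depth is computed as qual divided by depth.
    return v.qual / v.depth >= min_qd


def needs_sanger_confirmation(v: VariantCall) -> bool:
    """Very rare (MAF <= 0.1%) and novel variants are confirmed by Sanger sequencing."""
    return v.maf is None or v.maf <= 0.001


# Illustrative calls with made-up values.
calls = [
    VariantCall("variant_A", qual=6000.0, depth=1000, maf=0.005),
    VariantCall("variant_B", qual=900.0, depth=120, maf=None),
    VariantCall("variant_C", qual=100.0, depth=25, maf=0.0004),
]
for call in calls:
    if passes_filters(call):
        print(call.name, "Sanger" if needs_sanger_confirmation(call) else "no Sanger")
```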

Statistical analyses

Sample size was chosen empirically based on our previous experience in the calculation of experimental variability; no statistical method was used to predetermine sample size and no samples or data points were excluded from the reported analyses. Data are described as medians ± standard deviations, or n (%). Since [U] and [UH2]/[U] values were not normally distributed, non-parametric tests were performed. Allelic frequencies and genotype distributions were estimated by gene counting and tested for Hardy–Weinberg equilibrium. The chi-square test was used to compare proportions and to evaluate Hardy–Weinberg equilibrium. Because, in most cases, few individuals were homozygous for the alternate allele, the influence of genotype on DPD activity was assessed by clustering genotypes according to a dominant inheritance model. Genotype groups were then compared using non-parametric Mann–Whitney and Kruskal–Wallis tests. The level of significance was set at p < 0.05. All analyses were two-sided. Statistical analyses were performed using Prism® 5.0 (GraphPad) and JMP (SAS) software.
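As an illustration of the gene-counting and Hardy–Weinberg steps described above, the following minimal sketch computes the alternate allele frequency and a one-degree-of-freedom chi-square test from genotype counts. It is not the Prism or JMP procedure used in the study; the function names and the example counts are hypothetical, and no continuity correction is applied.

```python
import math
from typing import Tuple


def hardy_weinberg_chi2(n_ref_hom: int, n_het: int, n_alt_hom: int) -> Tuple[float, float, float]:
    """Return (alternate allele frequency, chi-square, P value) for one biallelic variant."""
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Gene counting: each homozygote carries two copies, each heterozygote one.
    p = (2 * n_ref_hom + n_het) / (2 * n)
    q = 1.0 - p
    observed = (n_ref_hom, n_het, n_alt_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value


def dominant_group(alt_allele_count: int) -> str:
    """Dominant model: heterozygotes and alternate homozygotes are pooled as carriers."""
    return "carrier" if alt_allele_count >= 1 else "non-carrier"


# Made-up genotype counts for a single variant.
alt_freq, chi2, p_value = hardy_weinberg_chi2(n_ref_hom=2700, n_het=260, n_alt_hom=12)
print(f"alt allele frequency={alt_freq:.4f} chi2={chi2:.2f} P={p_value:.3f}")
```

Pooling heterozygous and homozygous carriers, as in `dominant_group`, reduces each variant to a two-group comparison, which is what the Mann–Whitney test requires.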

Results

Inter-individual variability of pretherapeutic DPD enzyme activity

This retrospective study included 2972 subjects. Mean patient age was 65 ± 11 years, and the sex ratio (M/F) was 1.2 (Supplementary Table S1). Using a cut-off value of [UH2]/[U] ≤ 10, 580 patients (19.5%) were categorized as having partial DPD deficiency, whereas no patient exhibited complete DPD deficiency. Mean age did not differ significantly between the partial DPD deficiency group and the normal DPD group (Supplementary Table S1). Overall, [U] and [UH2]/[U] values together identified 628 patients (21.1%) with DPD deficiency, but the two parameters agreed in only 114 (18.2%) of these patients (Table 1). Indeed, 466 (15.7%) patients presented [UH2]/[U] ≤ 10 and [U] < 16 ng/mL, and 48 (1.6%) presented [UH2]/[U] > 10 and [U] ≥ 16 ng/mL (Table 1). The [UH2]/[U] level below which [U] values were all ≥ 16 ng/mL was 4.6, and the [U] level above which [UH2]/[U] values were all ≤ 10 was 49 ng/mL, suggesting that better agreement between [UH2]/[U] and [U] values in identifying DPD deficiency would require more restrictive thresholds. Based on these results, the current cut-off values for [U] and [UH2]/[U] do not identify DPD deficiency in an equivalent manner, and a [UH2]/[U] ratio ≤ 10 classifies a higher proportion of individuals as partially DPD deficient than [U] levels ≥ 16 ng/mL.
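The cross-tabulation behind these figures can be rebuilt from the counts quoted in the paragraph above. The sketch below is a bookkeeping aid only: the function name and dictionary keys are hypothetical, and the number of patients flagged by neither criterion is derived by subtraction rather than read from Table 1.

```python
from typing import Dict


def concordance_summary(total: int, both: int, ratio_only: int, uracil_only: int) -> Dict[str, float]:
    """Summarize agreement between the [UH2]/[U] and [U] deficiency criteria."""
    flagged = both + ratio_only + uracil_only
    if flagged > total:
        raise ValueError("flagged patients cannot exceed the cohort size")
    return {
        "neither": total - flagged,               # derived, not reported in the text
        "deficient_by_ratio": both + ratio_only,  # [UH2]/[U] <= 10
        "deficient_by_uracil": both + uracil_only,
        "deficient_by_either": flagged,
        "agreement_among_flagged_pct": 100.0 * both / flagged if flagged else 0.0,
    }


# Counts taken from the text: 2972 patients, 114 flagged by both criteria,
# 466 by the ratio only and 48 by [U] only.
summary = concordance_summary(total=2972, both=114, ratio_only=466, uracil_only=48)
for key, value in summary.items():
    print(key, round(value, 1))
```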

Table 1 Number of patients according to the uracil plasma concentration ([U]) and the dihydrouracil/uracil ([UH2]/[U]) plasma ratio.

Genetic variants identified in DPYD

The group of patients with partial DPD deficiency comprised 580 patients, including 134 wild-type patients (DPYD*1/*1) and 446 patients harboring at least one genetic variant (208 patients carried one genetic variant and 238 patients more than one). Overall, a total of 809 genetic variants were identified in patients with partial DPD deficiency. The remaining 2392 patients exhibiting normal DPD activity included 623 wild-type patients (DPYD*1/*1) and 1769 variant carriers, in whom a total of 3183 genetic variants were identified (831 carrying a single genetic variant and 938 carrying more than one). The mean coverage (read depth) of the identified genetic variants was 1130 (range: 33–4995) in the group of patients with partial DPD deficiency and 1131 (range: 33–7612) in the group of patients whose phenotype was unaltered. Thirty distinct genetic variants were identified in the group of patients exhibiting partial DPD deficiency (29 single nucleotide polymorphisms and one indel). Among these genetic variants, 23% (7/30) were common (MAF ≥ 1%) and 77% (23/30) were considered rare, very rare or novel (MAF < 1%); among the latter, 57% (13/23) were classified as deleterious according to variant effect prediction algorithms (Table 2). In addition, the majority of variants were missense (77%; 23/30), one was nonsense, one was categorized as an indel and two were located in canonical splice sites. Among the remaining variants, 10% (3/30) were synonymous. In the group of patients exhibiting a normal DPD phenotype, 58 unique genetic variants were identified, including 56 single nucleotide polymorphisms and two indels. Of these, 12% (7/58) were common whereas 88% (51/58) were considered rare, very rare or novel (MAF < 1%), including 35% (18/51) classified as deleterious by functional prediction algorithms. In addition, the majority of variants were missense (55%; 32/58), two were nonsense and six were located in canonical splice sites.
Among the remaining variants, 29% (17/58) were synonymous and 2% (1/58) were located in the untranslated regions (UTR). All rare genetic variants were heterozygous. Hardy–Weinberg equilibrium for each common and rare variant and allelic frequencies are reported in Supplementary Table S3. As the French law on information and freedom prohibits the collection of information on ethnicity, it was not possible to provide frequency data according to patient ancestry. We therefore assumed that our population was mainly European (Supplementary Table S3).

Table 2 List of the genetic variants identified in DPYD by next generation sequencing.

Association between the most clinically relevant DPYD defective variants and DPD deficiency

Dose adjustment based on pretreatment screening for the most clinically relevant DPYD defective variants, i.e. c.1679T>G (DPYD*13, rs55886062), c.1905+1G>A (DPYD*2A, rs3918290) and c.2846A>T (p.Asp949Val or rs67376798), has been shown to improve the safety of chemotherapy regimens based on fluorouracil [29]. Accordingly, international recommendations now provide indications for drug-related genetic tests and DPYD genotype-guided dosing in routine clinical practice [17, 18]. As expected, our data showed a significant association between each of these genetic variants and low DPD activity (Fig. 1).

Fig. 1: Association between the most clinically relevant DPYD defective rare variants and DPD deficiency.

Box plot showing pretreatment DPD activity, assessed by the dihydrouracil/uracil ([UH2]/[U]) plasma ratio, according to patient genotype. The box represents the interquartile range (25th to 75th percentiles), the line in the box represents the median, and the whiskers represent the range. The red dashed line indicates the ratio threshold used to categorize patients as having partial DPD deficiency (ratio ≤ 10) or normal DPD activity (ratio > 10). n = number of patients; ***P < 0.001; ****P < 0.0001.
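The carrier versus non-carrier comparisons summarized in this kind of box plot rely on the Mann–Whitney test named in the statistical methods. The following is a minimal stdlib sketch of that test, using mid-ranks for ties and a normal approximation without continuity correction; it is not the Prism or JMP implementation used in the study, and the ratio values in the example are invented for illustration.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided P value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        mid_rank = (i + 1 + j) / 2.0   # average of ranks i+1 .. j
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0                 # all values identical: no evidence of a shift
    z = (u1 - mean_u) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2.0))


# Invented [UH2]/[U] ratios for carriers and non-carriers of a variant.
carriers = [4.2, 5.1, 6.8, 7.5, 9.0]
non_carriers = [10.5, 11.2, 12.9, 14.0, 15.3, 18.7, 21.4]
u_stat, p_value = mann_whitney_u(carriers, non_carriers)
print(f"U={u_stat:.1f} P={p_value:.4f}")
```

For small groups an exact P value would be preferable to the normal approximation used here.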

Association between common DPYD genetic variants and DPD deficiency

The association between common DPYD genetic variants (MAF ≥ 1%) and DPD activity is summarized in Fig. 2. Among the seven genetic variants identified, three variants (c.1236G>A or rs56038477, p.Glu412Glu; c.496A>G or rs2297595, p.Met166Val; DPYD*6, c.2194G>A or rs1801160, p.Val732Ile) were significantly more frequent in the group of patients exhibiting partial DPD deficiency. Consistent with previous reports, the c.1236G>A variant (rs56038477), which is included in the risk haplotype B3, was significantly associated with low DPD activity [30, 31]. Nevertheless, compared to the most clinically relevant DPYD defective variants, the association of these three variants with DPD activity was rather modest (Fig. 2).

Fig. 2: Association between common DPYD genetic variants and DPD deficiency.

Box plot showing pretreatment DPD activity, assessed by the dihydrouracil/uracil ([UH2]/[U]) plasma ratio, according to patient genotype. The hapB3 haplotype is represented in yellow whereas the other common variants are in green. The box represents the interquartile range (25th to 75th percentiles), the line in the box represents the median, and the whiskers represent the range. The red dashed line indicates the ratio threshold used to categorize patients as having partial DPD deficiency (ratio ≤ 10) or normal DPD activity (ratio > 10). n = number of patients; ns = non-significant; *P < 0.05; **P < 0.01.

Association between rare, very rare and novel DPYD genetic variants and DPD deficiency

The frequent (MAF ≥ 1%), rare (MAF < 1%) and very rare (MAF ≤ 0.1%) variants identified in the DPYD gene in the whole cohort are listed in Table 2. The number of patients in each group is summarized in Fig. 3A. Variants with a MAF below 1% were enriched in patients exhibiting low DPD activity (9.3% versus 3.2%; P < 0.00001) (Fig. 3B). This enrichment remained significant when the rare clinically relevant DPYD defective variants were excluded (4.5% versus 2.6%; P < 0.03). As many rare variants are likely to have little to no impact on DPD activity, a similar analysis was performed, after excluding the rare clinically relevant DPYD defective variants, on variants with a MAF below 1% and a putative deleterious impact on DPYD function according to the CADD score (threshold above 15). Indeed, a CADD score above 15 has previously been shown to perform well as a prediction tool for pharmacogenetic variants [32]. Not surprisingly, these variants were more common in the group of patients with low DPD activity (4.2% versus 1.6%; P < 0.001) (Fig. 3C). Overall, our results indicate that rare DPYD genetic variants account for a significant part of the interindividual variability of DPD activity.
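The variant selection and the comparison of proportions described above can be sketched as follows. The sketch assumes that novel variants (no reported MAF) are counted with the MAF < 1% class, and it uses a Pearson chi-square test without continuity correction. The function names are hypothetical, and the carrier counts in the example are back-calculated from the reported percentages (about 9.3% of 580 and 3.2% of 2392) purely for illustration; they are not read from Fig. 3.

```python
import math
from typing import Optional, Tuple


def maf_class(maf: Optional[float]) -> str:
    """Frequency classes used in the text; MAF is given as a fraction."""
    if maf is None:
        return "novel"
    if maf >= 0.01:
        return "common"
    if maf <= 0.001:
        return "very rare"
    return "rare"


def is_rare_predicted_deleterious(maf: Optional[float], cadd: float,
                                  clinically_relevant: bool) -> bool:
    """MAF < 1% (or novel), CADD > 15, and not one of the known defective variants."""
    return maf_class(maf) != "common" and cadd > 15 and not clinically_relevant


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square and P value (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one patient")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Illustrative counts only: carriers / non-carriers of a MAF < 1% variant
# in the partial-deficiency group and in the normal-activity group.
chi2, p_value = chi2_2x2(54, 580 - 54, 77, 2392 - 77)
print(f"chi2={chi2:.1f} P={p_value:.2e}")
```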

Fig. 3: Association between DPYD genetic variant frequency and pretreatment DPD activity.

(A) Flow chart showing the distribution of all identified DPYD genetic variants according to the minor allele frequency (MAF) in the groups with partial DPD deficiency ([UH2]/[U] plasma ratio below or equal to 10) and normal DPD activity ([UH2]/[U] plasma ratio above 10) (numbers of patients are reported). (B) Distribution of DPYD genetic variants with a minor allele frequency (MAF) below 1% according to pretreatment DPD activity (numbers of patients and percentages are reported). (C) Distribution of the DPYD genetic variants with a MAF below 1% and predicted to impact DPD activity (CADD score > 15) in the groups of patients exhibiting normal or low DPD activity (numbers of patients and percentages are reported).

Discussion

Innovative and collaborative research efforts over the last decades have substantially improved our understanding of the role played by inherited genetic changes in the interindividual variability in drug efficacy or toxicity [33]. Large-scale sequencing studies have notably shown that single-nucleotide variants are the most common form of protein-altering “functional variants” identified among genes relevant to drug pharmacokinetics and pharmacodynamics, also known as pharmacogenes [33, 34]. Of particular interest, results from these studies have also revealed that rare genetic variants account for a substantial part of the unexplained interindividual differences in drug response, but their exact contribution to drug pharmacokinetics has not been systematically evaluated and thus remains poorly understood [33,34,35,36]. In this study, we focused on dihydropyrimidine dehydrogenase, a key enzyme in the catabolism of the chemotherapeutic agent 5-FU or its prodrugs, whose complete deficiency is associated with impaired clearance of 5-FU, excessive drug accumulation and severe toxicity.

Various genotyping and phenotyping approaches have been developed to assess DPD deficiency in order to reduce the incidence of severe toxicity without affecting treatment efficacy by tailoring the dose of fluoropyrimidine-based therapy. Although uracil-based methods are routinely used in several countries to predict DPD deficiency, the clinical relevance of pretreatment DPD phenotyping by these assays remains controversial [30]. Indeed, optimal cut-off levels that predict toxicity have not been validated yet, and previous studies have shown extensive variability in uracil measurements when different cohorts were compared [12, 18, 37, 38]. In line with this, de With et al. [39] very recently raised important concerns about the utility of uracil-based assays in clinical practice, given the large inter-center variability observed in measured pretreatment uracil levels. By contrast, the clinical validity of genotype-based approaches has been established in multiple meta-analyses as well as in large prospective studies [39, 40]. Results from these studies have in particular shown that prediction of DPD enzyme activity by molecular genetic testing in routine clinical practice is a reliable method that not only significantly improves patient safety but is also cost‐effective [41]. Consequently, clinical practice guidelines now recommend pre-emptive DPYD genotyping, especially in Europe, where the four clinically relevant DPYD deficient alleles are relatively common in individuals of Caucasian ancestry [42]. Nevertheless, even with this strategy, the prediction of fluoropyrimidine-induced toxicity remains suboptimal and does not detect all patients at risk [43]. In this context, we aimed to assess whether rare genetic variants significantly contribute to the large interindividual variability of DPD enzyme activity in a series of about 3 000 patients using new sequencing technologies.

Next Generation Sequencing (NGS) refers to a wide range of technologies enabling rapid and high-throughput sequencing of DNA [44]. In recent years, NGS has been successfully used to comprehensively interrogate the entire spectrum of genomic variation in pharmacogenes, including rare variants [33]. In line with this, we applied an NGS-based approach to capture rare and common genetic variations located either in the coding sequence of the DPYD gene or in its flanking intronic regions. Specifically, our results confirmed the strong impact of the three rare clinically relevant variants. Additionally, although a significant association between DPD activity and three known common variants, including Haplotype B3, was also shown in our large series of patients, their modest effect on DPD activity raises the question of their clinical relevance. Therefore, we suggest additional studies to clarify their use in prospective DPYD genotyping, especially as our study may be biased by several confounding factors. Of particular interest, our results also showed the importance of considering rare DPYD genetic variants to predict the risk of 5-FU toxicity. This is in agreement with sequencing data from large distinct populations, which showed that the vast majority of variants among pharmacogenes are rare (MAF < 1%) or very rare (MAF ≤ 0.1%) and non-synonymous, with an estimated 30–40% of functional variability likely attributable to these rare variants [45]. For example, resequencing of 202 drug target genes in about 14 000 individuals showed that more than 95% of the identified variants had a MAF below 0.5% and that 90% of those were previously unknown [46]. In light of our results, we suggest that additional studies should be performed to assess the association between rare DPYD genetic variants and fluoropyrimidine toxicity.
This point is important and represents one limitation of our study, as we could only assess the relationship between rare genetic variants and DPD activity.

In conclusion, our results strongly suggest that integrating rare genetic variants into routine pharmacogenetic testing can significantly improve the prediction of DPD enzyme activity. Therefore, we advocate that pre-emptive screening for DPD deficiency should be based on a more comprehensive genotyping approach, combined with phenotyping strategies, to ensure the safe administration of fluoropyrimidines.