The Predictive Value of Genetic Analyses in the Diagnosis of Tetrahydrobiopterin (BH4)-Responsiveness in Chinese Phenylalanine Hydroxylase Deficiency Patients

Molecular characterization of PAH deficiency has proven essential in establishing treatment options. We examined the diagnostic accuracy of two genetic assays for predicting BH4 responsiveness, to determine whether the AV sum test or the mutation-status assessment test can obviate the need for BH4 loading in Chinese patients. The overall predicted response in 346 patients was 31.65% by the AV sum test and 25.43% by the mutation-status assessment; both percentages were lower than the 51.06% derived from loading results in 94 patients. Responders were compound heterozygotes with definite BH4 responsive mutations, while non-responders had null/null genotypes; some responses were consistently associated with specific mutations and genotypes. The sensitivity and specificity of the assays were 81.1% and 92.5% for the AV sum test, and 82.9% and 97.3% for the mutation-status assessment. An AV sum cutoff >2 has a positive predictive value (PPV) of 90.9%, while the presence of at least one BH4 responsive mutation has a PPV of 97.1%. The two approaches showed good concordance. Our data confirmed that the mutation-status assessment has a higher diagnostic accuracy in predicting response for Chinese patients than the AV sum test. BH4-responsiveness may be predicted or excluded from patients' molecular characteristics to some extent, so some patients may avoid the initial loading test.

Therefore, it is of crucial importance to detect this subtype of patients correctly and to determine the individual sensitivity to BH4; a reduction of 30% or more in blood Phe concentration 24 hours after administration of sapropterin is the most commonly used biochemical definition of a positive response 20 . Since the test is onerous to perform, finding a rational testing modality is imperative for further application. As PAH gene (NG_008690.1) mutations have been characterized, the association of these mutations with disease phenotypes has been uncovered. It was acknowledged that although different biochemical phenotypes were observed in identical genotypes, their BH4 responsiveness was always the same except in a very small percentage of patients 21 and BH4-responsiveness was a common trait among subjects with the milder forms of PAH deficiency 22,24,25 . This observation significantly increased research activities on the feasibility of using genetic analysis to assess the response 26-30 and a consensus was reached that such research might help to guide clinical decisions and promote personalized medicine in managing PAH deficiency for specific populations.
Table footnotes. Phenotypes were stratified as mild hyperphenylalaninemia (MHP), mild phenylketonuria (mPKU), and classical PKU (cPKU) based on the pretreatment plasma Phe values. c An arbitrary value (AV), if available, was assigned to each mutation: AV = 1 for classical PKU mutation; AV = 2 for moderate PKU mutation; AV = 4 for mild PKU mutation; and AV = 8 for MHP mutation. Phenotypes resulting from a combination of the two mutant alleles were expressed as the sum of the two mutations' AVs. d The groups for the patients based on the assessment of the BH4-responsiveness-status of the detected mutations: I for "Non BH4 responsive"; II for "BH4 responsive"; III for "Undefined BH4 responsive". e Based on BH4 loading test; R: response; R+: responder; R−: non-responder.
The prevalence of PAH deficiency in the Chinese population is about 1 in 11,572 live births 31 . Previously, our team carried out genetic analyses for patients from diverse regions of China 32, 33 , investigated the mutational spectrum, and established the genotype-phenotype correlations. These studies comprise the initial steps in the development of an optimal molecular diagnostic algorithm. In the current study, we used two assays to predict the prevalence of BH4 response in a cohort of 346 Chinese PAH deficient patients. The assigned values (AV) sum approach 34 is a genotype severity tool in which the predicted phenotype for a patient is expressed numerically as the sum of the two mutations' AVs. The predicted phenotypes for all our patients were derived from Guldberg et al. 34 (see the Appendix) and our previous paper 32 . The other approach is based on the assumption that the presence, on at least one PAH gene copy, of specific mutations that are repeatedly found to be associated with BH4-responsiveness would be sufficient for a positive BH4 response. The performance of each test in predicting the BH4 response was validated against the clinical BH4 loading results in 94 of these patients. We assessed the accuracy of both analyses in differentiating between BH4 responsive and non-BH4 responsive patients, and examined whether the AV sum test or the mutation-status assessment test can obviate the need for BH4 loading.

Results
The 346 PAH deficient patients (Table S1) , respectively], and the remaining patients were compound heterozygotes (311/346, 89.88%). Homozygosity was found in 9.12% of our population and 8 mutations accounted for over two-thirds of the 692 alleles, with p.Arg243Gln (c.728G > A) at a 24.06% frequency. Thus, the investigation of common mutations or allelic combinations in relation to BH4 responsiveness was of great interest to Chinese patients in general. Based on the AV sum information 32 (17/46, 39.96%) were mPKU, and none (0/46, 0%) was MHP. It was found that 9 (9/46, 19.57%) of the non-responders had a Phe reduction rate less than 20% at 8 hours after the loading and between   35 defined this situation as "slow responsiveness". Interestingly, in our study, these late responsive patients had cPKU (6) and mPKU (3). According to the phenotypic scheme, response to BH4 was detected in 15.15% (5/33) of cPKU, in 67.27% (37/55) of mPKU and in all 6 MHP patients. The differences in the percentages of responders among the three phenotypic groups were statistically significant (χ2 = 28.56, df = 2, p < 0.001). As shown in Table 1, the reduction of Phe from baseline among BH4 responders ranged between 30.12% and 99.75%, revealing a considerable degree of inter-subject variability. Most responders (37/48, 77.08%) had normal or nearly normal (≤360 μmol/L) Phe levels 24 hours following the BH4 challenge, whereas 11 responders did not. The Phe for 3 out of 5 cPKU responders and 28 out of 37 mPKU responders reached the normal range while all 6 MHP patients showed pronounced Phe reductions, indicating that the rate of Phe reduction was largely dependent on the phenotypic class of the patients. For the group of 37 responders (Fig. 1a), the blood Phe level was 642.32 ± 294.23 μmol/L during pre-treatment and 688.65 ± 216.23 μmol/L at 0 hour of BH4 loading. The latter was reduced to 170.62 ± 83.80 μmol/L at 24 hours after loading.
Data analysis of this group suggested a significant correlation between pre-treatment Phe and the percentage of Phe reduction after challenge (r = 0.378, P < 0.05), whereas Phe before BH4 (at 0 hour) did not correlate significantly with the extent of response (r = 0.08, P > 0.05). No significant correlation was found between pre-treatment Phe and Phe before BH4 (r = −0.209, P > 0.05). The group of the other 11 responders (Fig. 1b) had a pre-treatment Phe concentration of 1110.00 ± 413.54 μmol/L and 1238.82 ± 710.33 μmol/L at 0 hour of BH4 loading. The latter was reduced to 733.07 ± 454.64 μmol/L at the end of the test. Significant correlations existed between pre-treatment Phe and Phe at 24 hours (r = 0.777, P < 0.01), and between pre-BH4 Phe and Phe at 24 hours (r = 0.981, P < 0.001), which was inconsistent with the view derived from the study of Leuzzi, V. et al. 36 .

Classification of Mutations and Genotypes for BH4 Response. Genetic analysis revealed 45 mutations
and identified 66 genotypes in these 94 patients ( and p.Arg413Pro (c.1238G > C) as consistently non-BH4 responsive variants. Both classifications were unambiguous because the response was consistent in two or more functionally hemizygous patients. Since p.Arg408Gln (c.1223G > A) was found in two compound-heterozygous patients, and the following mutations including p.Arg155His (c.464G > A), p.Arg158Gln (c.473G > A), p.Pro275Ala (c.823C > G), p.Met276Lys (c.827T > A), p.Ile324Asn (c.971T > A) and p.Gln419Arg (c.1256A > G) were found in only one functionally hemizygous patient, the assignment of these variants to any specific BH4 responsive category remains uncertain.   Table 1. The majority of responders (81.08%, 30/37) had an AV sum >2, indicating that at least one mutation is moderate or mild in severity. The remaining seven responders carried two severe mutations (AV = 1) and were given an AV sum of 2. BH4 responders represented a specific heterogeneous group, with the AV sum ranging from 2 to 8, 8 being the highest AV sum, present in a homozygote for p.Arg241Cys. In contrast, the AV sum for the majority of non-responders (92.5%, 37/40) was 2, indicating a severe PAH genotype. The other three non-responders, who carried a moderate mutation in combination with a severe one, had an AV sum of 3. We then analyzed the ROC curve (Fig. 2) to verify the efficiency of this approach and to determine the cutoff value of the AV sum that differentiated PAH deficient patients with and without BH4 response. The AUC value was 0.891 (95% confidence interval (95% CI) = 0.811-0.972). The optimal cutoff level of the AV sum, determined from the curve by sensitivity and specificity, was 2.5. Since AV sums ranged from 2 to 8, with a whole number indicating the patient phenotype, we used AV sum >2 as the cutoff.
The sensitivity and specificity at this cutoff were 81.1% and 92.5%, corresponding to a positive predictive value (PPV) of 90.9% and a negative predictive value (NPV) of 84.1% (Table 3).  Table 2). We note that the observed phenotype of the patient did not always match his predicted
Finally, we used the kappa test to examine the consistency between the two approaches in classifying the clinical BH4 response status. The results of the AV sum approach showed good consistency with those of the mutation BH4-responsiveness-status assay (kappa = 0.71, P < 0.001).
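The AV sum figures reported above can be re-derived from the underlying 2×2 table. The following is a minimal Python sketch for illustration only; the function name `diagnostic_metrics` is hypothetical, and the counts are taken from the text (30 of 37 responders and 3 of 40 non-responders had an AV sum >2).

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # responders correctly predicted
        "specificity": 100.0 * tn / (tn + fp),  # non-responders correctly predicted
        "ppv": 100.0 * tp / (tp + fp),          # predicted responders who responded
        "npv": 100.0 * tn / (tn + fn),          # predicted non-responders who did not
    }

# Counts for the 77 BH4-challenged patients with AV sum data (from the text):
# 30 responders with AV sum > 2, 7 responders with AV sum = 2,
# 3 non-responders with AV sum > 2, 37 non-responders with AV sum = 2.
av_sum_metrics = diagnostic_metrics(tp=30, fn=7, fp=3, tn=37)
for name, value in av_sum_metrics.items():
    print(f"{name}: {value:.1f}%")
```

Run as written, this gives 81.1%, 92.5%, 90.9% and 84.1%, which agrees with the values quoted for the AV sum approach.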

Discussion
Sapropterin has been available as a non-dietary therapy option for patients with BH4-responsive PKU since 2007 in the United States, 2008 in the European Union, and 2010 in Canada. It is prescribed to patients 4 years of age and onwards 4 . The implementation of pharmacological intervention at a younger age has been recently indicated to be feasible and beneficial 37 . It is also noted that sapropterin could be considered as a treatment option in pregnant women with PKU who cannot achieve the recommended ranges of blood Phe level with dietary therapy alone 38 . So a careful selection of patients who are eligible for the therapy using sound procedures (with low false-negatives and false-positives) is imperative. The goal of this study is to investigate the characteristics of BH4-responsiveness in Chinese patients and to determine the predictive value of genetic analysis for BH4-responsiveness in order to identify the appropriate candidates for this medication.
Our BH4 loading results demonstrated that BH4 responsiveness was mostly associated with milder phenotypes (43/48, 89.58%) except 5 (5/48, 10.42%) BH4 responders being cPKU, and the rate of Phe reduction was largely dependent on phenotypic classes with 28 mPKU responders, 6 MHP responders and 3 cPKU responders reaching normal or nearly normal (≤360 umol/L) Phe levels 24 hours post load. These findings supported the statement that mild phenotypes respond better to BH4 challenge 22 . Therefore, we speculated that half our patients (50.29%, 174/346) might be potential candidates for future pharmacologic intervention with concomitant relief or withdrawal of the burdensome diet therapy. In our group, we noted that there were individuals (8/18, 44.44%) who were previously untreated and in whom a positive response to BH4 was identified. The results of our study, coupled with the findings proposed by Moseley KD 39 , Koch R 40 and Grosse SD 41 , led to the consideration that sapropterin should be an appropriate treatment option for some Chinese patients, and that it might be useful as an adjuvant in maintaining blood Phe levels, thus hindering fluctuations. Inhibiting the fluctuations in blood Phe levels has been recognized as neuro-protective 42 , particularly during infancy, early childhood, and pregnancy. Moreover, sapropterin may allow BH4-responsive patients, who were later diagnosed, to make progress in their development. Our previous work 32 has proven that genetic diagnosis was useful for evaluating the biochemical phenotypes. In this study, we estimated the accuracy of two commonly used genetic diagnostic tests for their statistical prediction of BH4 responsiveness. The sensitivity and specificity relative to the BH4 responsiveness status of the detected mutations in predicting clinical response were 82.9% and 97.3%, respectively, a little higher than those of the AV sum tool, 81.1% and 92.5%. 
Our results showed that the AV sum approach appeared to provide a lesser degree of sensitivity for identifying patients, which was not consistent with the view proposed by Quirk et al. 43 , who concluded that the AV sum tool was viable for identifying definitive responders with a predictive sensitivity of 89.5%, as derived from their clinical results of a cohort of 58 patients.
Furthermore, we evaluated the effects of the two tests in 346 Chinese patients. Our data identified 31.65% of our cohort as potentially BH4-responsive using the AV sum test and 25.43% by the other. However, the overall frequency of BH4-responsiveness in the 94 patients who took the loading test was 51.06%, almost two times higher than the predicted data. The following reasons might contribute to this discrepancy. In our study, only p.Arg241Cys (c.721C > T), p.Val388Met (c.1162G > A) and p.Arg261Gln (c.782G > A) were confirmed to be consistently responsive; p.Arg243Gln (c.728G > A) and p.Arg413Pro (c.1238G > C), which were defined as responsive by Zurfluh et al. 22 , were defined as null mutations here. Apart from the different criteria for mutation classification due to population-based findings, a unique genotypic feature of high heterogeneity was partly related to the discrepancy. The mutation spectrum observed in China 33 combined a small number of common mutations with a very high number of rare ones. Based on our observation, for the most common 8 mutations, p.Arg241Cys (c.721C > T) was consistently associated with BH4 responsiveness; p.Arg243Gln (c.728G > A) was associated with non-BH4 responsiveness except for 5 patients; p.Ex6-96A > G (c.611A > G), p.[IVS4-1G > A] (c.442-1G > A), and p.Val399Val (c.1197A > T), as splicing mutations, were equally distributed in both categories; while p.Arg413Pro (c.1238G > C), p.Arg111* (c.331C > T) and p.[IVS6-1G > A] (c.707-1G > A) were consistent with their non-BH4 responsiveness in every patient, which might be the reason that the lower "concordance rate" (83.72%) was observed in patients within the expected non-BH4 responsive group. In our study, the patients with the rare mutations were excluded from the AV sum analysis or were assigned to the "Undefined BH4 responsive" group since most rare mutations had not been analyzed in vitro and were of unknown BH4 responsiveness.
In terms of optimizing the use and allocation of public health resources, a BH4 response detection procedure based on genotyping is an important step for our population, since the estimated potential incidence of BH4-responsiveness in China is not low, yet the drug used for treatment is costly and in limited supply. The detailed reasons are as follows. First, the genotype-predicted prevalence of BH4-responsiveness is almost two times lower than that obtained by loading tests; this may lead to fewer false positive results, thus eliminating the need to challenge every patient. Second, the information on established BH4 responsiveness or non-responsiveness of single alleles and mutational combinations derived from our research allows Chinese patients with those genotypes to avoid a prior 24 h BH4 loading test. Third, since more than 10% of classic patients responded to the loading test, 32.73% of mild ones did not respond, and 19.57% of the non-responders were "slow responsive" based on our data, these patients may need a longer testing period to show observable long-term BH4 efficacy; the response in these patients is difficult to evaluate properly, and the current BH4 loading test should be optimized accordingly to identify the patients who are suitable for complete diet liberalization or who can increase their intake of natural protein with sapropterin as an adjuvant option.
In this way, the discrepancies observed in our work and the importance of genetic analysis in Chinese patients stress the need for a population-specific approach to the evaluation of BH4 responsiveness. As a next step, we aim to verify the clinical relevance of our personalized procedure (Fig. 3) and to analyze the effects of different sapropterin concentrations, dosages, and durations on the outcome of BH4 loading tests in individuals with different genotypes in a prospective study, which might serve as a way to establish individualized treatment for Chinese PAH deficiency patients.
(Table S1). These patients were found to have two presumably causative mutations in the PAH gene. In all patients, PAH deficiency had been detected either by national screening (56.94%, 197/346) or by the presence of neurological deterioration observed at an older age (43.06%, 149/346), with a plasma Phe cutoff level of 120 μmol/L. Patients with defects in the synthesis and recycling of BH4 were excluded by analysis of urinary pterins and dihydropteridine reductase activity in erythrocytes. Of these patients, 94 for whom informed consent was obtained took part in the BH4 loading test (40 males and 54 females, ages ranging from 1 month to 5.6 years; 18 patients presented neurological deterioration at an older age while the others were detected by national screening). Table S2 shows that there were no significant differences between the demographic characteristics of patients who took the BH4 loading tests and those who did not.

Methods
The metabolic phenotype for each patient was stratified according to the individual's plasma Phe concentration before treatment. All of the Phe levels were maximum values (Phe range of 132-2937 μmol/L) during pretreatment. Phe was measured using a fluorometric method until 2003, followed by tandem-mass spectrometry. Patients were classified as cPKU (Phe more than 1200 μmol/L), mPKU (Phe 360 to 1200 μmol/L), and MHP (Phe less than 360 μmol/L) according to the Chinese HPA consensus updated in 2014 3 .
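The stratification rule above can be written as a small function. This is an illustrative Python sketch, not code from the study; the function name `classify_phenotype` is hypothetical, and treating the boundary values 360 and 1200 μmol/L as mPKU is an assumption based on the wording "360 to 1200".

```python
def classify_phenotype(phe_umol_l):
    """Map a maximum pretreatment plasma Phe value (umol/L) to a phenotype class."""
    if phe_umol_l > 1200:
        return "cPKU"   # classical PKU: Phe more than 1200 umol/L
    if phe_umol_l >= 360:
        return "mPKU"   # mild PKU: Phe 360 to 1200 umol/L (boundaries assumed inclusive)
    return "MHP"        # mild hyperphenylalaninemia: Phe less than 360 umol/L

# 132 and 2937 are the extremes of the reported Phe range; the rest are arbitrary.
for phe in (132, 360, 900, 1200, 2937):
    print(phe, classify_phenotype(phe))
```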
The Ethics Committee of Xinhua Hospital affiliated with Shanghai Jiao Tong University School of Medicine approved this study and informed consents were subsequently obtained from the parents of the patients enrolled. The methods were performed in accordance with the relevant approved guidelines.
Scientific Reports | 7: 6762 | DOI:10.1038/s41598-017-06462-y
BH4 Loading Test. The BH4 loading tests that were used as the gold standard diagnostic method for BH4 response in this study were performed between 2005 and 2009 following the routine protocol 20 combined with our local regulation 44 . During the entire testing period, patients had no dietary restrictions. Blood Phe was measured by dried blood spot analysis at 0, 2, 4, 6, 8 and 24 hours after oral administration of 20 mg/kg KUVAN™ (sapropterin dihydrochloride; Biomarin Pharmaceutical Inc., Novato, CA) in 68 patients whose initial plasma Phe results were ≥600 μmol/L. For 26 patients whose Phe levels were <600 μmol/L, L-Phe (100 mg/kg; Shanghai Pujiang Institute of Applied Biochemistry, China) was given initially, followed by sapropterin (20 mg/kg). These patients' blood samples were taken at 0, 1, 2 and 3 hours after Phe loading and at 0, 2, 4, 6, 8 and 24 hours after BH4 loading. Responsiveness to BH4 was calculated as the percentage of blood Phe reduction 24 hours after sapropterin administration. A reduction of at least 30% indicated a positive response. No side effects were observed during the test. Plasma Phe concentrations were determined by tandem-mass spectrometry (API 4000 LC/MS/MS System, Shimadzu, Tokyo, Japan).
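The response criterion described above reduces to a simple percentage calculation. The sketch below is illustrative; the function names are hypothetical, and the example feeds in the group mean Phe values reported for the 37 responders in Results (688.65 μmol/L at 0 hour, 170.62 μmol/L at 24 hours) purely as sample inputs, not as a per-patient result.

```python
def phe_reduction_percent(phe_0h, phe_24h):
    """Percentage reduction in blood Phe between 0 and 24 hours after sapropterin."""
    if phe_0h <= 0:
        raise ValueError("baseline Phe must be positive")
    return 100.0 * (phe_0h - phe_24h) / phe_0h

def is_responder(phe_0h, phe_24h, threshold=30.0):
    """A reduction of at least `threshold` percent counts as a positive response."""
    return phe_reduction_percent(phe_0h, phe_24h) >= threshold

# Sample inputs: group means quoted in Results, used only for illustration.
reduction = phe_reduction_percent(688.65, 170.62)
print(f"reduction: {reduction:.1f}%, responder: {is_responder(688.65, 170.62)}")
```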
Mutation Analysis. Genomic DNA was isolated from peripheral blood samples, and the 13 exons and related intronic boundaries of the PAH gene were amplified. All PCR products were scanned for mutations by direct sequence analysis; most of this sequencing had been done previously 32,33 . All experiments were conducted according to the standard protocol 45,46 , and the details were described in our previous paper 33 . Mutations were referred to by their description at the DNA level and protein level (http://www.hgvs.org/mutnomen). Since the mutations present in 68 individuals lacked expression data, these individuals were excluded from the analyses that depended on in vitro expression information. For mutation classification in relation to BH4-responsiveness, we used the criteria developed by Zurfluh et al. 22 , supplemented by data from the BIOPKU database and other published papers 21,23 .

Mutations.
A mutation/allele was regarded as BH4-responsive if it appeared either as homozygous or compound heterozygous form associated with a known null mutation in BH4 responders 22 . BH4-responsiveness was expected in patients with a BH4-responsive mutation on at least one PAH gene copy. We assigned each patient to one of three groups according to the following criteria 47 : (1) "Non-BH4 responsive" group: if both alleles of the patient were null mutations; (2) "BH4 responsive" group: if the patient carried a consistent BH4 responsive mutation on at least one allele; (3) "Undefined BH4 responsive" group: a patient with the presence of at least one mutation encoding a protein with known residual activity but having inconsistent or pending information on BH4 response.
Of 346 patients, 88 patients were assigned to the "BH4 responsive" group, 189 to the "Non-BH4 responsive" group, and the remaining 69 to the "Undefined BH4 responsive" group. We thus attempted to predict the BH4 responsiveness of 346 patients from the respective groups that were assigned based on the information of the single allele.
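The three-group assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-allele status labels ("null", "responsive", "undefined") and the function name `assign_group` are hypothetical, and the status of each mutation is assumed to have been determined beforehand from the literature criteria cited in the text.

```python
NULL = "null"              # null mutation
RESPONSIVE = "responsive"  # consistently BH4 responsive mutation
UNDEFINED = "undefined"    # residual activity, inconsistent or pending BH4 data

def assign_group(allele_1, allele_2):
    """Assign a patient to a BH4-responsiveness group from the two allele statuses."""
    statuses = (allele_1, allele_2)
    for status in statuses:
        if status not in (NULL, RESPONSIVE, UNDEFINED):
            raise ValueError(f"unknown allele status: {status!r}")
    if RESPONSIVE in statuses:
        return "BH4 responsive"        # at least one consistently responsive allele
    if all(status == NULL for status in statuses):
        return "Non-BH4 responsive"    # both alleles are null mutations
    return "Undefined BH4 responsive"  # anything else

print(assign_group(NULL, NULL))
print(assign_group(RESPONSIVE, NULL))
print(assign_group(UNDEFINED, NULL))
```

Checking for a responsive allele first reflects the stated expectation that one BH4-responsive mutation on either gene copy is sufficient for a response.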

Determining the PAH Genotype Severity Using the AV Sum and Classification of BH4
Response. We used an arbitrary assigned value (AV) approach stated by Guldberg et al. 34 to assess the PAH genotype severity of our patients, and the resulting predicted phenotypes from a combination of the two mutant alleles were represented as the sum of the two mutations' AVs in 278 patients (data were shown in our previous paper) 32 . The ideal cutoff level of AV sum used for predicting the response was determined by the receiver operating characteristic (ROC) curve analysis. Of 278 patients, 88 with an AV sum >2 were classified as "AV sum responders" and 190 with an AV sum = 2 were classified as "AV sum non-responders".
We investigated the prevalence of BH4 response in those 278 patients who had AV sum data.
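The AV sum rule can be illustrated with a short Python sketch. The AV values per severity class (1, 2, 4, 8) and the >2 cutoff are taken from the text; the dictionary keys and function names are hypothetical, and the AV of any individual mutation would have to be looked up in the published assignments rather than in this example.

```python
# AV per mutation severity class, as defined in the text.
AV_BY_SEVERITY = {"classical": 1, "moderate": 2, "mild": 4, "MHP": 8}

def av_sum(severity_1, severity_2):
    """Sum of the assigned values of the two mutant alleles."""
    return AV_BY_SEVERITY[severity_1] + AV_BY_SEVERITY[severity_2]

def predicted_av_responder(total, cutoff=2):
    """Patients with an AV sum above the cutoff are classed as 'AV sum responders'."""
    return total > cutoff

for pair in (("classical", "classical"), ("moderate", "classical"), ("mild", "mild")):
    total = av_sum(*pair)
    print(pair, total, predicted_av_responder(total))
```

Two classical alleles give a sum of 2 (predicted non-responder), a moderate allele with a classical one gives 3, and two mild alleles give 8, the highest sum reported among the responders.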

Assessing the Diagnostic Value of Both Classification Approaches in Predicting BH4
Response. The ability of the AV sum to differentiate the patient's clinical BH4 response status was validated in 77 BH4-challenged patients having AV estimates for both mutations. Those patients were dichotomized into "BH4 responder" and "non-BH4 responder" groups both by their clinical response in the BH4 loading test and by their AV sum. To quantify the ability of the AV sum to classify BH4 response, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. The area under the ROC curve (AUC) was rated on five levels: excellent (0.9 to 1), good (0.8 to 0.9), fair (0.7 to 0.8), poor (0.6 to 0.7), and not discriminating (0.5 to 0.6) 48 . The ability of the detected mutations' BH4 responsiveness-status to discriminate the patient's clinical BH4 response status was assessed in the 94 patients loaded with BH4. The patients were assigned to one of the three groups (mentioned in Methods: assigning groups based on the assessment of the BH4-responsiveness-status of the detected mutations). The clinical classification was based on the BH4 loading test. Sensitivity, specificity, PPV, and NPV were calculated to evaluate performance relevant to clinical management.
Statistical Analysis. All statistical analyses were performed using SPSS 17.0 (SPSS Inc., Chicago, Illinois).
Initially, all data were analyzed using the Kolmogorov-Smirnov test to assess whether the data were normally distributed. Quantitative data were reported as mean ± standard deviation; normally distributed parameters were compared using ANOVA and non-normally distributed parameters were compared using the Mann-Whitney U test. Qualitative data were compared using the chi-square test; a p-value < 0.05 was considered statistically significant.
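As a worked illustration of the chi-square comparison, the statistic reported in Results for responders across the three phenotypic groups can be recomputed by hand. The sketch below is plain Python and is not the SPSS procedure used in the study; the function name `chi_square` is hypothetical, and the non-responder counts (28, 18, 0) are derived here by subtracting the reported responders (5/33 cPKU, 37/55 mPKU, 6/6 MHP) from the group totals.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Columns: cPKU, mPKU, MHP. Rows: responders, non-responders (derived counts).
observed = [[5, 37, 6], [28, 18, 0]]
stat, dof = chi_square(observed)

# For exactly 2 degrees of freedom the upper-tail probability is exp(-stat / 2);
# this shortcut does not hold for other degrees of freedom.
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.2f}, df = {dof}, p < 0.001: {p_value < 0.001}")
```

With these counts the statistic comes out at 28.56 with 2 degrees of freedom, matching the reported value.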