The use of targeted exome sequencing in genetic diagnosis of young patients with severe hypercholesterolemia

Familial hypercholesterolemia (FH) is an autosomal dominant disorder. Although genetic testing is an important tool for detecting FH-causing mutations in patients, diagnostic methods for young patients with severe hypercholesterolemia are understudied. This study compares the targeted exome sequencing (TES) technique with the DNA resequencing array technique in young patients with severe hypercholesterolemia. A total of 20 unrelated patients (mean age 14.8 years) with total cholesterol > 10 mmol/L were included. Twelve patient samples were processed by DNA resequencing array and 14 by TES, including 6 samples that were processed by both methods. Functional characterization of novel mutations was performed by flow cytometry. The mutation detection rate (MDR) of DNA resequencing array was 75%, while the MDR of TES was 100%. A total of 27 different mutations in the LDLR gene were identified, including 3 novel mutations and 8 mutations with previously unknown pathogenicity. Functional characterization of the c.673delA, c.1363delC, p.Leu575Phe and p.Leu582Phe variants showed that all of them are pathogenic. Additionally, 7 patients were diagnosed with heterozygous FH (HeFH) but had lipid levels significantly higher than those of typical HeFH patients. These data indicate that TES is a very efficient tool for genetic diagnosis in young patients with severe hypercholesterolemia.

most countries 4 . Recently, a consensus statement of FH experts emphasized the need for a global effort to generate and integrate data to better manage FH because of the challenge it poses to global public health 5 .
Conventional genetic diagnosis of FH is primarily based on direct DNA sequencing, or on a combination of direct sequencing and multiplex ligation-dependent probe amplification (MLPA), which detects large insertion or deletion mutations 6 . Because of the high cost of both these methods, alternative DNA resequencing assays have been developed to optimize the detection of FH-causing mutations 7 . In recent years, next-generation sequencing has also been successfully used to screen for FH-causing mutations 8,9 . We previously showed that targeted exome sequencing (TES) successfully detected LDLR mutations in an FH patient 10 . The aim of the present study is to compare the TES technique with the DNA resequencing array technique in the genetic diagnosis of young patients with severe hypercholesterolemia. We found that TES-based screening yields a more accurate genetic diagnosis, and we conclude that TES is the more appropriate diagnostic technique for young patients with severe hypercholesterolemia.

Study subjects.
A total of 20 probands (10 males and 10 females) were included in our study. The mean age of the probands was 14.8 ± 8.8 years. About 60% of the probands were from South China. All the probands presented with xanthoma and 85% had corneal arcus. Clinical and biochemical characteristics of the probands are shown in Table 1. The mean TC and LDL-C levels were 16.1 ± 2.9 mmol/L and 13.2 ± 2.4 mmol/L, respectively. Color Doppler ultrasound showed that half of the probands had mitral valve regurgitation and 55% had aortic valve regurgitation. The mean IMT of ten probands was 0.18 cm and the mean CVFR of 12 probands was 2.19. Nearly 90% of the probands received statin treatment, and only one proband received no lipid-lowering treatment because of parental refusal.
Mutation detection rates. Twenty proband DNA samples were analyzed by DNA resequencing array and targeted exome sequencing (Fig. 1). A total of 27 mutations in the LDLR gene were identified (Table S1). No mutation was found by DNA resequencing array in proband 11, even though a c.1187-10G>A mutation was detected by whole-exome sequencing in a previous study 11 . In the DNA array group, a total of 16 mutations were identified across the 12 probands tested. However, the DNA resequencing array itself detected only 12 of these mutations, yielding a detection rate of 75%. In the TES group, all 23 mutations were detected across the 14 probands, a detection rate of 100%. For the 6 proband DNA samples studied by both methods, DNA resequencing array correctly detected 72.7% (8/11) of mutations, while TES correctly detected 100% (11/11) of mutations (Table 2).
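The detection rates quoted above follow directly from the reported counts. The short Python sketch below only re-derives them as a consistency check; the function name and dictionary keys are illustrative and not part of the study's pipeline.

```python
def detection_rate(detected, total):
    """Return the mutation detection rate as a percentage (1 decimal place)."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return round(100 * detected / total, 1)

# Counts as reported in the text (detected mutations, mutations present).
rates = {
    "array_all_probands": detection_rate(12, 16),    # 12 probands, DNA array
    "tes_all_probands": detection_rate(23, 23),      # 14 probands, TES
    "array_shared_probands": detection_rate(8, 11),  # 6 probands, both methods
    "tes_shared_probands": detection_rate(11, 11),   # 6 probands, both methods
}

for group, pct in rates.items():
    print(f"{group}: {pct}%")
```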

Spectrum of FH-related mutations.
Overall, 27 different LDLR mutations were found across the 20 probands (Table 2). In Proband 12, only one SNP (1060 + 10 G > C in intron 7, rs12710260) was detected. No mutations in the APOB and PCSK9 genes were found in any of the probands. Of the 27 mutations detected, there were 19 missense mutations (70.4%), 2 nonsense mutations (7.4%), 2 single-base deletion mutations (7.4%), 3 splicing-site mutations (11.1%), and 1 synonymous mutation (3.7%). Most mutations were located in either exon 4 (7/27, 25.9%) or exon 12 (5/27, 18.5%). The most common mutation was W483X (n = 6, 22.2%), located in the tenth exon of the LDLR gene, in which a G-to-A substitution at nucleotide 1448 introduces a premature termination codon leading to a truncated LDLR. The second most common mutation was A627T (n = 2, 7.4%), located in the thirteenth exon, which encodes part of the EGF precursor homology domain of LDLR. It has been previously shown that LDLR A627T is a pathogenic variant that accumulates inside the cell and is unable to reach the plasma membrane 12 .
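As a consistency check on the mutation spectrum, the percentages can be recomputed from the counts given above. This is a minimal sketch using only those counts; the variable and function names are illustrative.

```python
# Mutation counts by type, as reported in the text (27 mutations in total).
mutation_counts = {
    "missense": 19,
    "nonsense": 2,
    "single-base deletion": 2,
    "splicing-site": 3,
    "synonymous": 1,
}

def percentages(counts):
    """Convert raw counts to percentages of the total (1 decimal place)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

total_mutations = sum(mutation_counts.values())
spectrum = percentages(mutation_counts)

print(f"total mutations: {total_mutations}")
for name, pct in spectrum.items():
    print(f"{name}: {pct}%")
```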
Novel mutations and mutations with unknown pathogenicity found in the LDLR gene. We found a total of 3 novel mutations in LDLR and 8 previously described mutations with unknown pathogenicity (Table S2). The 3 novel mutations were c.673delA, p.Leu575Phe, and p.Leu582Phe. The mutation c.673delA causes a frameshift, leading to a premature stop codon 40 amino acids after the deletion site. In silico analysis predicted this mutation to be pathogenic (Table S2). In silico analysis of the p.Leu575Phe and p.Leu582Phe variants showed some discrepancies depending on the software used. While Align GVGD predicts p.Leu575Phe and p.Leu582Phe to be probably damaging, the other in silico software predicts both to be pathogenic. The 8 previously reported mutations with unknown pathogenicity found in this study were p.Asp172His, p.Cys204Tyr, p.Cys222Phe, c.818-2A>G, p.Arg416Trp, c.1363delC, p.Cys711Tyr, and p.Asn825Lys. In silico analysis predicted all of these mutations to be pathogenic. The p.Asn825Lys variant showed the highest CADD value (21.3), and the p.Cys222Phe variant showed the lowest (7.147).
Functional studies. We experimentally characterized the function of the 3 novel LDLR variants identified in this study (c.673delA, p.Leu575Phe, and p.Leu582Phe) and of c.1363delC, a novel variant previously found by our group 13 . A total of five plasmids carrying EGFP were constructed (WT, c.673delA, c.1363delC, p.Leu575Phe and p.Leu582Phe) as described in Materials and Methods. LDLR expression was assayed by immunoblotting. As shown in Fig. 2A (upper panel), the c.673delA and c.1363delC variants are not expressed in transfected cells, while expression of the mature p.Leu575Phe and p.Leu582Phe variants is lower than that of WT LDLR. Equal protein loading was confirmed in each blot by membrane stripping and further incubation with antibodies to visualize cytosolic GAPDH protein (Fig. 2A, lower panel).
Flow cytometry was performed to determine the activity of these variants; examples of raw FACS results are shown in Fig. 2B-D. We found that c.673delA and c.1363delC resulted in diminished expression (approximately 65% of WT). To a lesser extent, LDLR expression of the p.Leu575Phe and p.Leu582Phe variants was also diminished (Fig. 2E). Concurrently, LDL binding and uptake were both significantly diminished in all 4 variants, more markedly in the c.673delA and c.1363delC variants (Fig. 2E).
Relationship between genotype and phenotype. Cascade screening was performed in 18 families including a total of 71 patients, and the data obtained are shown in Table 3. Among them, genetic diagnosis identified 4 HoFH patients, 14 compound heterozygous patients, 52 HeFH patients (including 7 HeFH patients with high TC) and 1 mutation-negative patient. The average age of HoFH and compound heterozygous patients was lower than the average age of HeFH patients. However, the lipid profile values in HeFH patients (TC, LDL-C, and non-HDL-C) were lower than those determined in HoFH and compound heterozygous patients (Table 3). Lipid profile comparison between HoFH and compound heterozygous patients showed no significant differences between the two groups. Interestingly, 7 of the patients diagnosed with HeFH had significantly higher TC, LDL-C and non-HDL-C levels than the remaining patients in the HeFH group. Based on the different genotypes of HeFH found in this study, LDL-C was compared between three groups: patients with missense mutations, patients with nonsense mutations, and patients with splicing-site mutations. Patients with nonsense mutations had significantly lower LDL-C levels than patients with missense mutations. There were no significant differences in the levels of the other lipids that were analyzed (Table S3).

Discussion
In this study, we determined that targeted exome sequencing is a powerful tool for genetic diagnosis of young patients with severe hypercholesterolemia. We used both DNA resequencing array and targeted exome sequencing and found 27 LDLR mutations. Among them, 3 were novel mutations and 8 were previously described LDLR mutations with unknown pathogenicity. Notably, TES showed a higher mutation detection rate than DNA resequencing array while also being a faster screening technique for patients with severe hypercholesterolemia.
The demand for molecular diagnosis leading to more personalized treatment of genetic diseases has increased the need for more accurate, cheaper and faster technologies to obtain genetic information. This challenge has driven improvements in next-generation sequencing (NGS) technologies. In the case of FH, Rios et al. first used whole-genome resequencing to perform a genetic diagnosis in an 11-month-old female with severe hypercholesterolemia and found that the patient carried two nonsense mutations in the ABCG5 gene 14 . Since then, whole exome sequencing has been used to detect mutations in FH patients, but this technology has not yet found any FH-causing mutations in genes other than LDLR, APOB or PCSK9 15,16 . Targeted exome sequencing in combination with clinical criteria now has a high success rate in the genetic diagnosis of FH patients with hyperlipidemia 8,17-19 . TES is also cheaper and more efficient than other genetic diagnostic technologies. Despite these advantages, few studies have focused on screening patients with severe hypercholesterolemia. The primary challenge in studying severe hypercholesterolemia is the extremely low prevalence of HoFH, which affects approximately 1 in 160,000 people around the world 20 .
In the general population both HoFH and HeFH are significantly underdiagnosed. Therefore, there is a pressing clinical incentive to develop effective screening methods 4,20 for the early detection of FH. In previous studies, we used whole exome sequencing and TES to identify patients with HoFH and found that these technologies are useful for providing an accurate genetic diagnosis in patients with severe hypercholesterolemia 10,11 . In the present work, we analyzed 20 young patients with severe hypercholesterolemia by array sequencing, TES, or both. The results show that 95% are homozygous or compound heterozygous FH patients. Among them, there were 6 probands in which the genetic analysis was performed by both methods. In these probands, DNA array sequencing had a mutation detection rate of 72.7% (8/11), while the mutation detection rate of TES was 100% (11/11). Future large cohort studies should confirm the higher mutation-detecting efficiency of TES in severe hypercholesterolemia patients. In addition to characterizing the efficiency of TES, our study also identified 3 novel LDLR mutations and 8 mutations with previously unknown pathogenicity. Across all probands, 27 LDLR mutations were identified. In silico analysis predicted all of the novel mutations and mutations of unknown pathogenicity to be pathogenic. We experimentally characterized LDLR activity in the 3 novel LDLR variants (c.673delA, p.Leu575Phe and p.Leu582Phe) and one variant previously described by our group (c.1363delC). We found that all four variants are pathogenic and result in reduced LDLR expression and LDL uptake.
Through cascade screening, we then found 52 relatives who also carried LDLR mutations. Based on their genotype, all patients were divided into three groups: HoFH, compound heterozygous and HeFH patients. In agreement with other studies, the HeFH group showed lower TC, LDL-C and non-HDL-C levels than the other two groups. Interestingly, in the HeFH group we found 7 patients with significantly higher levels of TC, LDL-C and non-HDL-C. One possible explanation for this finding is that LDLR activity in this subset of patients is also being modulated by epigenetic factors. A recent study found that DNA methylation plays a significant role in the clinical heterogeneity of FH patients because epigenetic perturbations of key lipoprotein metabolism genes affect plasma lipid levels in these patients 21 . We further divided HeFH patients into 3 subgroups based on the type of mutation present: missense, nonsense, and splicing-site mutations. There was only one significant difference in lipid levels among the three groups: LDL-C levels in the nonsense mutation group were significantly lower than those in the missense mutation group. There are several possible explanations for this finding. First, the sample size of this study is small, so the analysis may be underpowered to represent the entire cohort. Second, these patients may carry a disruptive-missense mutation that could influence clinical results. Previous studies have identified a class of LDLR disruptive-missense mutations that significantly elevate LDL-C levels 22 . Third, environmental factors, such as lifestyle or diet, could also have influenced these results.
Despite the constant improvement of diagnostic tools, no causative major gene mutation has been identified in about half of all FH patients 23 . It is possible that FH has a polygenic origin in many of these patients. In 2013, Talmud et al. used 12 LDL-C-related SNPs in a case-control study to construct a gene score that can distinguish patients with polygenic FH from those with monogenic FH 23 . Recently, Futema et al. used ROC curves to determine the optimum number of LDL-C SNPs and refined the genetic risk score to diagnose polygenic FH using only 6 LDL-C SNPs 24 . In this study, we found one proband with a homozygous clinical phenotype without any mutation in LDLR, APOB or PCSK9. Accordingly, this patient may have polygenic hypercholesterolemia.
In conclusion, we used TES to perform genetic diagnosis of young patients with severe hypercholesterolemia and found 27 LDLR mutations. Among them, 3 novel LDLR mutations were found and functionally characterized as pathogenic. In addition, TES showed a higher mutation detection rate than DNA resequencing array. The results indicate that TES is a very efficient tool for genetic diagnosis in young patients with severe hypercholesterolemia.

Materials and Methods
Study subjects. A total of 20 unrelated patients with total cholesterol > 10 mmol/L were included in the study. Subjects attended the atherosclerosis clinic at the Anzhen Hospital, Beijing, China, during the period 2003-2015. DNA samples from 12 patients were processed using DNA resequencing array 7 and samples from 14 patients were processed by targeted exome sequencing 10 , including samples from 6 patients that were processed by both methods. Some probands were reported in previous studies 11,13,25,26 . This study was approved by the Research Ethics Committee of Beijing Anzhen Hospital in China. Written informed consent was obtained from all participants. The study was performed in accordance with approved guidelines.
Data collection. Basic clinical data (including age, gender, family history and treatment) were recorded when patients were first admitted to the hospital. Peripheral venous blood samples were drawn after a 12 h fasting period. TC, LDL-C, triglycerides (TG) and high-density lipoprotein cholesterol (HDL-C) were measured using routine commercial kits (Beckman Coulter, Brea, USA) and an automated biochemistry analyzer (Beckman AU 4500, Brea, USA) 27 . Lipid values from blood samples taken after lipid-lowering treatment were also recorded for future evaluation.
DNA extraction and sequencing. DNA was extracted using the phenol-chloroform centrifugation method. The exon-by-exon sequencing method was used, and amplification was performed with the touchdown polymerase chain reaction (PCR) method developed in our laboratory 25 . The presence of mutations in the LDLR gene, the PCSK9 gene and the R3500Q mutation in the APOB gene was determined. All 18 exons and exon-intron boundaries of LDLR (GenBank accession number: NM_000527.4), the 12 exons and exon-intron boundaries of the PCSK9 gene (GenBank accession number: NM_174936), and exon 26 and exon 29 of APOB were analyzed 27 .
Target exome sequencing. A total of 167 lipid-related genes, including genes associated with blood lipids and bile acid metabolism, were selected and captured by a previously reported gene capture strategy using a GenCap Custom Enrichment Kit (MyGenostics, Beijing, China) 10

Cell culture and transfection for Western blot analysis of LDLR variants. A plasmid containing WT-LDLR tagged at the N-terminal with enhanced green fluorescent protein (EGFP) was constructed as previously described 27 .
In vitro functional characterization of LDLR variants. Human embryonic kidney 293 (HEK-293) cells were transiently transfected with the corresponding plasmids using Lipofectamine® 3000 (Invitrogen, USA).
LDLR functional studies were then performed 48 h after transfection. Flow cytometry (BD, USA) was used to determine the amount of cell-surface LDLR expression, LDL binding and internalization activities. Phycoerythrin (PE)-conjugated mouse monoclonal anti-human LDLR (1:20, R&D) antibodies were used to assay LDLR expression, and 20 μg/ml DiI-LDL (Molecular Probes, USA) was used to determine LDL binding and internalization 27 . Flow cytometry results (expression, LDL binding and LDL uptake) were normalized to the GFP signal, which was used as an internal control of transfection efficiency.
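The GFP normalization described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming that each readout is divided by the GFP signal of the same sample and then expressed as a percentage of WT; the function names and all fluorescence values are hypothetical and are not data from this study.

```python
def normalize_to_gfp(signal, gfp):
    """Divide a fluorescence readout by the GFP signal of the same sample."""
    if gfp <= 0:
        raise ValueError("GFP signal must be positive")
    return signal / gfp

def percent_of_wt(variant_ratio, wt_ratio):
    """Express a GFP-normalized variant readout as a percentage of WT."""
    if wt_ratio <= 0:
        raise ValueError("WT ratio must be positive")
    return 100 * variant_ratio / wt_ratio

# Hypothetical fluorescence intensities (arbitrary units), for illustration only.
wt_ratio = normalize_to_gfp(signal=800.0, gfp=400.0)
variant_ratio = normalize_to_gfp(signal=300.0, gfp=300.0)

relative_expression = percent_of_wt(variant_ratio, wt_ratio)
print(f"variant readout relative to WT: {relative_expression:.1f}%")
```

Normalizing before comparing to WT keeps a poorly transfected sample from being mistaken for a poorly expressed variant.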
Statistical analysis. All measurements in the LDLR functional assays were performed at least three times. Normally distributed continuous data are presented as mean ± SD. A Student's t test was used to compare the mean values of quantitative variables, and a χ² test was used to compare frequencies of qualitative variables. A P-value < 0.05 was considered statistically significant. SPSS software (version 18.0 for Windows; SPSS, IBM) was used for statistical analysis.
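For readers who want to reproduce this kind of comparison outside SPSS, the sketch below computes mean ± SD and a two-sample Student's t statistic with pooled variance using only the Python standard library. It assumes an unpaired, equal-variance test, which the text does not specify, and the triplicate values are hypothetical. The P-value would then be read from a t distribution with the returned degrees of freedom.

```python
import math
import statistics

def mean_sd(values):
    """Return the sample mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)

def student_t(a, b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, df

# Hypothetical triplicate measurements (% of WT), for illustration only.
wt_values = [100.0, 98.0, 102.0]
variant_values = [60.0, 62.0, 58.0]

wt_mean, wt_sd = mean_sd(wt_values)
t_stat, dof = student_t(wt_values, variant_values)
print(f"WT: {wt_mean:.1f} ± {wt_sd:.1f}; t = {t_stat:.2f}, df = {dof}")
```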