Characterization of rare ABCC8 variants identified in Spanish pulmonary arterial hypertension patients

Pulmonary Arterial Hypertension (PAH) is a rare and fatal disease where knowledge about its genetic basis continues to increase. In this study, we used targeted panel sequencing in a cohort of 624 adult and pediatric patients from the Spanish PAH registry. We identified 11 rare variants in the ATP-binding Cassette subfamily C member 8 (ABCC8) gene, most of them with splicing alteration predictions. One patient also carried another variant in SMAD1 gene (c.27delinsGTAAAG). We performed an ABCC8 in vitro biochemical analyses using hybrid minigenes to confirm the correct mRNA processing of 3 missense variants (c.211C > T p.His71Tyr, c.298G > A p.Glu100Lys and c.1429G > A p.Val477Met) and the skipping of exon 27 in the novel splicing variant c.3394G > A. Finally, we used structural protein information to further assess the pathogenicity of the variants. The results showed 11 novel changes in ABCC8 and 1 in SMAD1 present in PAH patients. After in silico and in vitro biochemical analyses, we classified 2 as pathogenic (c.3288_3289del and c.3394G > A), 6 as likely pathogenic (c.211C > T, c.1429G > A, c.1643C > T, c.2422C > A, c.2694 + 1G > A, c.3976G > A and SMAD1 c.27delinsGTAAAG) and 3 as Variants of Uncertain Significance (c.298G > A, c.2176G > A and c.3238G > A). In all, we show that coupling in silico tools with in vitro biochemical studies can improve the classification of genetic variants.


Results
Description of the cohort. In this study, a total of 744 individuals were analyzed, including 624 patients (579 adults and 45 pediatric) and 120 first degree relatives (from trios and cosegregration studies of positive cases). Baseline characteristics of this cohort are provided in Supplementary Table 1. ABCC8 variants were identified in 11 patients (1.76% of the patients analyzed). The mean age at diagnosis of these 11 patients was 34 ± 10 years whereas 6 of them were female (54%). Patients 3, 4, 6, 8 and 11 were diagnosed with Idiopathic Pulmonary Arterial Hypertension and patient 4 had a familial form. Patient 7 had PAH associated to Systemic Sclerosis. Patient 10 had a ventricular septal defect, which was repaired at 40 years of age. Patient 2 was clinically diagnosed with possible PVOD associated with systemic sclerosis and HIV infection; PH was diagnosed at 57 years of age. The patient fulfilled all clinical and radiological criteria, so the patient was initially diagnosed with PVOD. DLCO at diagnosis was 22% of predicted value. CT scan showed the typical triad of PVOD: ground glass opacities, interlobular septal thickening and mediastinal lymphadenopathies. The patient also had respiratory insufficiency at rest and significant drop of oxygen desaturation during exercise. Patient 11 was diagnosed at 24 years of age. In the initial diagnosis work-up, restrictive ventricular septal defect was observed. Suprasystemic PAH and right to left shunt were confirmed in the RHC. Considering these findings, ventricular septal defect closure was contraindicated and the patient is currently receiving systemic prostanoids.
Most of the patients showed a Functional Class III (64%) at diagnosis and the mean distance in 6 min' walk test was 400 ± 120 m. At the moment RHC was performed; Cardiac Index was 2.7 ± 0.7 l/min/m 2 , mean Pulmonary Artery Pressure 58.8 ± 21.8 mmHg, Right Atrium Pressure 7 ± 3 mmHg and Pulmonary Vascular Resistance 10.8 ± 4.7. All the clinical variables are disclosed in Table 1.
Mutational screening. After  Several of these variants have been described in other pathologies. The variant c.298G > A p.(Glu100Lys) was described in Maturity Onset Diabetes of the Young 25 . Variants c.2176G > A p.(Ala726Thr), c.3976G > A p.(Glu1326Lys) and c.2694 + 1G > A were found in congenital hyperinsulinism patients [26][27][28] . Lastly, c.2422C > A p.(Gln808Lys) has been described in a diabetes mellitus type 1 patient 27 . For the rest of the variants no relation with any disease has been found, c.211C > Tp.(His71Tyr) and c.3288_3289del p.(His1097ProfsTer16) are described here for the first time. The rs identification of all the variants can be found in Supplementary Table 2 Only two families decided to undertake cascade testing. Patient 4 (carrying p.Thr548Met) inherited the variant from his mother (unaffected carrier), his father was WT. In the case of patient 7 (carrying c.2694 + 1G > A), her mother was WT and her father was deceased years before the enrollment in the registry, so the variant was either de novo or inherited from her father.
Variant in silico analysis and classification. The ACMG prediction score was used to determine the grade of evidence to support the pathogenicity (PP3) and benignity (BP4) of the variants: BP4 (0-2), Variant of Uncertain Significance (VUS) (2-4) and PP3 (4)(5)(6)(7). In the case of frameshift or exon skipping mutations (PVS1) was stated. A summary of all the predictions is detailed in Tables 2 and 3 The splicing effect predictions were taken into account to decide whether or not experimental confirmations needs to be performed, scores are detailed in Table 3. The variants where 2 predictors showed alterations were automatically considered candidates. Exceptions were made for c.298G > A p.(Glu100Lys) due to its location within an intron-exon junction, c.211C > T p.(His71Tyr) because of the possible partial loss of the exon, and c.1643C > T p.(Thr548Met) that was chosen to assess a missense "control" variant in the assay.
Overall, most of the variants were conserved among the evolution, c.211C > T p.   II  III  III  III  III  III  III  III  II  II Table 3.
We also used the MaxEntScan tool from Human Splicing Finder to measure the strength of the splice sites encoding the exons used in the minigenes. We did not find a correlation between the predicted strength of the site and the expression of the minigene construct (Supplementary Table 4).   3E) and its stability (Fig. 3G). All the missense variants, both confirmed and unconfirmed, were compared in detail (Fig. 3H). Concerning the stability, the variations accounted were of less than ± 500 between them and the wild type. We then analyzed all the missense variants with Missense3D, which showed no structural alterations caused by the amino acid substitutions. All the homology models generated by Phyre2 showed less stability than the template used for its modeling ( Supplementary Fig. 2), as expected when comparing an experimentally obtained template and simulated (modeled) data.

Discussion
PAH is a devastating disease if it is not treated. During the last decade the application of high throughput sequencing has increased the knowledge on the genetic basis of PAH, becoming a strong tool with high potential to find out novel treatments 30 .
In this study we identified by a custom NGS targeted sequencing panel of 21 genes (HAP v1.2), eleven variants in ABCC8 gene after screening 624 patients from the Spanish PAH Registry (REHAP).
This set of patients with ABCC8 variants compose a 1.76% of our cohort and show a younger age of diagnosis when compared with other recent studies 9,31 . Also, they seem to show more severe forms of the disease with higher PAP values and FC, but this may be due to the small number of patients in our cohort referring variants in this gene. Regarding the etiologies, this set is composed mostly by IPAH patients, with two patients of CHD-APAH and single patients of HPAH, CTD-APAH and, for the first time, an associated form of PVOD.
None of them show any clinical feature related to congenital hyperinsulism as mutations in the ABCC8 gene have been described in this disorder [23][24][25] .
Most of the detected ABCC8 variants were located in regions within or near exon-intron boundaries so the possible effects in the splicing had to be taken into account. After using pathogenicity and splicing predictors, we tested nine variants through minigene assay to determine the effect in splicing at mRNA level. Only four of the constructs worked as expected, confirming the correct processing of c.211C > T p. (His71Tyr), c.298G > A p.(Glu100Lys) and c.1643C > T p.(Thr548Met) located in exons 2, 3 and 9, respectively. The variant c.3394G > A www.nature.com/scientificreports/ p.(Asp1132Asn) causes an exon skipping of exon 27 of ABCC8, so we propose to change its nomenclature at RNA level from r.3394G > A to r.del3330_3399 and at protein level from p.(Asp1132Asn) to p.?. Unfortunately, constructs did not work in the five remaining variants; therefore, the possible splicing effect for these variants remains unknown. This may be due to neither the wild type nor the mutated exons were transcribed in our constructs. As a first approach, we tried to perform the minigene assay using standard length minigene constructs. With a fragment containing the exon of interest and, at least, 100-200 bp around 3′ and 5′, this approach allowed us to detect the exon 27 skipping, but none of the other constructs transcribed the exons, so we decided to use longer constructs with 800-1,500 bp around including multiple exons. With this approach we were able to confirm that c.211C > T p. (His71Tyr) Supplementary Fig. 1). One of them has binding sites for up to 40 transcription factors. Length of this regions are variable, ranging from less than 200 up to 2,000 bp, and strikingly, the constructs that worked (exons 2, 3, 9 and 10) did not have any regulatory element nearby, but exon 27 was right in the middle of one, and apparently 351 bp of it (including the exon) were enough to allow it to be transcribed. Furthermore, in the non-working constructs we had most of the regulatory region within exon 11 (GH11J017432); in the case of exon 20 and 21, the 1,200 bp insert included 800 bp of the regulatory region encoding exon 20 (GH11J017412); for exon 22, we missed the region GH11J017411 by 250 bp, this region is only 151 bp long; in the case of the construct 24-25-26 we had 500 bp of the GH11J017404 which encodes the end exon 25 and the whole exon 26; and lastly, the construct with exons 31-32-33 had its closest regulatory region 3,000 bp upstream of exon 31. The regulation by upstream regions and the need to use clusters of exons is not something new 32 , but it arises the limitations of testing splicing variants in vitro using minigenes instead of specific tissues, the larger the gene the more complicated its regulation will be as we can see in ABCC8. Thus, we would need to access specific tissue in order to properly test the pathogenicity of the resulting splicing variants.
In order to facilitate the in silico analysis, due to the unavailability of patient's specific tissue, we chose a bioinformatics approach using protein homology modeling and protein stability analyses. We modeled the effect of the missense changes and the confirmed skipping variants. These results highlight how exon skipping www.nature.com/scientificreports/ events in most of the situations would yield an unstable protein that may be degraded, unable to fold correctly and probably non-functional. All the amino acid changes simulated did not affect the overall structure and stability of the protein, but some of them are located in functional positions. p.(His71Tyr) and p.(Glu100Lys) are located in the region that interacts closely with the gating of the K ir 6.2 pore domain 33 . It is likely that additional changes affecting NBD1 (p.Ala725Thr, p.Gln808Lys) and NBD2 (p.Glu1326Lys) alter the catalytic activity of SUR1, and therefore, its role  www.nature.com/scientificreports/ yields the complete loss of the allele. Mutations in SMAD1 have been related to PAH, and functional studies have confirmed the decrease in signaling due to missense variants 34 , but the exact molecular mechanism that leads to PAH remains unknown 35 . Thus, we propose that PAH in this patient may be caused by the combination of both variants, where one of them could act as a 2nd hit needed to start PAH pathogenesis. Bohnen et al. 9 have already shown that amino acid substitutions affect the correct electrophysiological function of the SUR1-K ir 6.2 complex. Their study is performed using cDNA constructs encoding the variants to test. This means that all the variants analyzed will be treated as missense, and one would be losing all the effects from the possible alterations of the splicing process. Also, mRNA processing can play a vital role in the amount of mutant protein that is traduced; the mutated alleles can be degraded reducing the theoretical 50-50%.
Furthermore, the assembly of the SUR1-K ir 6.2 complex should be random, yielding channels with 1, 2, 3 or 4 SUR1 wild type or mutated proteins (if not degraded) making almost compulsory to carry out several measurements of activity by single channel patch clamp to be able to appreciate the real effect. Using cell culture to validate novel PAH related genes is useful, but animal models are still needed to fully associate loss of function in ABCC8 to PAH. Until now, none of the mice models referring ABCC8 mutations or its complete gene knockout have been described to develop PAH according to Mouse Genetic Information data, weakening the link between the gene and the disease. ABCC8 expression pattern does not exactly fit with PAH known pathogenesis, as we would expect SUR1 to be highly expressed in cardiac and smooth muscle cells, but this is not the case, those tissues express much more the closely related SUR2 in combination with both K ir 6.1 and K ir 6.2 36 . SUR1 is mainly expressed in pancreas, where its link to congenital hyperinsulinism 37 and neonatal diabetes 38 has been proven many years ago.
However, SUR1 is detectable in lung, proximal pulmonary arteries 9 and heart atrium 39 . Therefore, the only way to link the loss of function in ABCC8 with PAH is if it induces a vasoconstrictor effect similar to hypoxia. In hypoxia, K ATP channels are inhibited in the pulmonary smooth muscle cells, inducing Ca 2+ uptake and thus, an increase in contraction and proliferation that could start PH 18 . This would end up increasing shear stress, inducing endothelial cells to proliferate 40,41 .
In conclusion, we report eleven novel changes in ABCC8 gene found in PAH patients. Thanks to in vitro biochemical analyses and in silico tools we were able to classify them according to ACMG: two as pathogenic, six as likely pathogenic and three as VUS (Table 4).

Materials and methods
Description of the cohort. Since November 2011, genetic studies have been offered to all patients included in the Spanish Registry of Pulmonary Arterial Hypertension (REHAP) with idiopathic and hereditable forms of PAH, and PVOD 2,30 . All the methods were performed in accordance with the ethical principles of the European Board of Medical Genetics and the 2015 ERS/ESC guidelines for the diagnosis and treatment of pulmonary hypertension to provide accurate information on the range of options available to make informed decisions and allow equal access to genetic counseling and testing 1 . Pre and post-test genetic counseling was provided. All patients or legal tutors included in the analysis gave their informed consent. The project was approved by the ethical committee for scientific research of the Hospital Universitario 12 de Octubre (Comité Ético de Investigación Clínica-Hospital 12 de Octubre).
First degree relatives to affected probands were clinically screened when available. Cascade or cosegregration genetic tests were also offered. When an unaffected carrier was identified, a complete diagnostic work-up was performed, including electrocardiogram, echocardiogram, N-terminal pro-brain natriuretic peptide (NT-proBNP) and 6 min' Walk Test. Due to the typical phenotype, DLCO was also determined in healthy carriers. This evaluation is periodically repeated. When a sustained suspicion of early stage PAH was observed, right heart catheterization (RHC) was performed to rule out the condition.  Regarding of the effect assessed by the predictors, we assigned each variant a prediction score with a maximum of 7 points following this criteria: 1 = "Damaging/High", 0.5 = "Possibly damaging/Moderate" and 0 = "Benign", the score was used to sum up the level of evidence from the predictors in order to classify the variants following the ACMG guidelines. Because most of the variants were located near or within exon-intron junctions, the following splice effect predictors were also used: Alamut Interactive Software (Interactive Biosoftware, Rouen, France), NetGene2 52 , NNSplice 53 and Human Splicing Finder 54 . We then compared splicing prediction data with our minigene assay results in the cases we could. The conservation of amino acids or nucleotides was represented as sequence logos using data from the ABCC8 Ensembl reference sequences of 14 different species (human, chimp, mouse, rat, cat, dolphin, cow, elephant, alpaca, platypus, chicken, zebrafish, coelacanth and lamprey) with the R package ggseqlogos 55 .

Minigene design.
For the minigene assay we followed two different approaches. First, we amplified fragments containing the exon of interest and at least 80 bp of each flanking intronic regions. Due to the inconclusive results yielded by the standard constructs (lack of ABCC8 exons transcription in both wild type and mutated states), we increased the length of the fragments to a range between 1-1.5 kb, including multiple exons in the same construct. All the primers used for the PCR amplification are available in Supplementary Site directed mutagenesis. The mutated constructs for each of the variants were generated using the NZYmut site directed mutagenesis kit (NZYtech, Caparica, Portugal). We carried out the reaction using 100 ng of the wild type plasmid following manufacturer's protocol. All the primer sequences used for the reactions are indicated in Supplementary www.nature.com/scientificreports/ were separated by electrophoresis in a 2% agarose gel to analyze changes in pSPL3 transcript pattern, to confirm splicing alterations, we sequenced the bands using BigDye Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher, Waltham, USA). All the working constructs have been deposited in Addgene (Watertown, USA) pSPL3-ABCC8-2 (#135916), pSPL3-ABCC8-3 (#135917), pSPL3-ABCC8-9-10 (#135918) and pSPL3-ABCC8-27 (#135919).
ABCC8 regulation analysis. We evaluated the gene regulation network and binding motifs sequences using the UCSC genome browser 56 .
Protein modeling and protein stability analyses. We generated homology models for the different SUR1 variants with Phyre2 57 . The models were then superposed with the wild type SUR1 in Chimera 58 .
For the protein stability analysis, the models were built considering the protein structure C63O downloaded from the Protein Data Bank 22 as a template (indicated by Phyre2). Next, we obtained the discrete optimized protein energy (DOPE), which provides a measure of protein folding stability 59 and that is implemented in MODELLER 60 . When the predictors could not rule out possible splicing alterations, we simulated the effect of a whole exon skipping in the sequence (and the amino acid substitution or frameshift in the protein structure). The effects in the protein sequence were predicted by editing the reference sequence in Snapgene viewer (GSL Biotech, Chicago, USA). For the missense variants, we used the Missense3D server 61 to evaluate changes in protein conformation caused by amino acid substitution events.

Data availability
Data supporting the findings of this study are available on request from the corresponding author. The data is not publicly available due to privacy or ethical restrictions.