Introduction

Retinitis pigmentosa (RP; MIM# 268000) is the most frequent form of inherited retinal dystrophy (IRD), with a prevalence of 1 in 3000–4000 individuals worldwide1. It is characterised by a progressive dysfunction associated with the death of rods and/or cones, which leads to retinal atrophy and loss of vision. The mode of inheritance of RP is complex: autosomal dominant (ad), autosomal recessive (ar) and X-linked (xl) Mendelian cases have been reported, as have some digenic and mitochondrial forms1,2,3. From a genetic perspective, over 80 disease-causing genes are currently associated with RP, 27 of which have been associated with adRP (http://www.sph.uth.tmc.edu/retnet). However, to date, mutations in the known adRP genes account for only 50–75% of dominant cases, depending on the test and population used in the study4. This percentage is increasing, mainly due to the implementation of Next Generation Sequencing (NGS)-based techniques5,6,7 and the discovery of new RP genes8,9,10,11.

Most human genes harbour introns that are removed during pre-mRNA splicing, a post-transcriptional modification12. The splicing reaction is catalysed by the spliceosome, a multisubunit complex comprising small noncoding nuclear RNAs (U1, U2, U4, U5, and U6) and several associated proteins13. The spliceosome orchestrates the two transesterification reactions needed to remove introns and to join the adjacent exons, and operates by step-wise formation of sub-complexes that recognise regulatory sequences and promote efficient splicing12,13,14.

Mis-regulation of splicing is a common feature of many human diseases, including several retinal diseases15,16,17,18. These disorders can be caused by mutations that disrupt the splicing of specific genes or by mutations in genes coding for splicing factors, the latter leading to a more general loss of spliceosomal function. Thousands of splice-site mutations have been identified in patients with retinal dystrophies. Although most of these mutations disrupt a consensus splice-site sequence and cause exon skipping, some result in intron inclusion, novel exon inclusion, or the usage of cryptic upstream or downstream splice sites. The resulting alteration in the protein sequence, which is often concomitant with frameshift and premature termination, unsettles the functional protein domains and leads to degeneration of the retina16. In addition, mutations in several genes coding for core spliceosomal proteins, such as pre-mRNA splicing factors (PRPF3, PRPF4, PRPF6, PRPF8, PRPF31, RP9) or RNA helicases (SNRNP200), are responsible for adRP14,16,17. However, given that these genes are expressed ubiquitously and are highly conserved in all eukaryotes, it remains unclear why mutations in these genes are associated exclusively with adRP. Studies performed in rodent retina showed that PRPF3, PRPF31 and PRPF8 expression levels are higher in the retina than in other tissues in normal adult mice, thus suggesting that the retina may have a higher basal splicing demand than other tissues given that it is one of the most metabolically active tissues in the body16,19.

In order to effectively identify adRP mutations, we have sequenced 31 genes associated with the autosomal dominant inheritance pattern using the Ion PGM platform (IPGM; Life Technologies), in combination with Sanger sequencing. We selected these genes as they have been linked to most of the cases of adRP reported. Remarkably, we found a high prevalence of mutations affecting the splicing process among our families, especially mutations affecting trans-acting splicing factors. This is of particular interest considering that several splicing-based therapeutic approaches, some of which are in clinical trials15,17, are under active development for mutations affecting either core spliceosomal proteins or splice site mutations of individual genes.

The results of the present study will help in genetic counselling and will contribute to a better characterisation of the disease. Moreover, they may have a therapeutic impact in the near future in the light of analogous approaches used for other RNA mis-splicing diseases.

Results

High variant detection coverage and sensitivity were achieved

An average of 3.3 million reads/chip was obtained. On average, each amplicon present in the panel was covered 658 times, with 95.92% of amplicons having >30x coverage and 94.27% having >50x coverage. Those regions with no or low coverage (<30x), probably due to the presence of repetitive sequences or self-annealing of primers, were re-analysed. For this re-analysis we used a highly sensitive, cost-effective method that we described recently, which combines high resolution melting (HRM) analysis with direct sequencing20. This allowed us to expand our analysis to 97% of target amplicons. Despite the implementation of HRM, no additional mutations were found within these re-analysed regions.
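The coverage figures above are simple summaries over per-amplicon read depths. As a minimal sketch of how such a summary could be computed (the function names and the toy depths are hypothetical, not taken from the study's pipeline):

```python
from statistics import mean

def coverage_summary(depths, thresholds=(30, 50)):
    """Mean depth and percentage of amplicons strictly above each threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    summary = {"mean_depth": mean(depths)}
    for t in thresholds:
        above = sum(1 for d in depths if d > t)
        summary[f"pct_gt_{t}x"] = 100.0 * above / len(depths)
    return summary

def low_coverage_amplicons(depth_by_amplicon, min_depth=30):
    """Amplicons below min_depth, i.e. candidates for HRM re-analysis."""
    return sorted(name for name, d in depth_by_amplicon.items() if d < min_depth)

# Hypothetical toy data, for illustration only
toy = {"amp_001": 812, "amp_002": 12, "amp_003": 45, "amp_004": 640}
print(coverage_summary(list(toy.values())))
print(low_coverage_amplicons(toy))  # ['amp_002']
```

With real data, the depth table would come from the coverage report of the sequencing run; the 30x cut-off mirrors the threshold quoted in the text.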

Variant identification

An average of 45 variants, including SNPs and INDELS, were initially identified for each sample in the targeted regions. This includes the negative control, which carried 51 SNPs, none of which was putatively disease-causing, as expected (see Supplementary Table S1). After the clinically relevant variant identification screening described in the Materials and Methods section, we were able to identify putative disease-causing mutations in a total of 14 out of the 29 probands, which resulted in a ratio of clinically relevant genetic findings of 48.28%. A description of the main features of the genetic findings can be found in Table 1.

Table 1 Summary of mutations responsible for Retinitis Pigmentosa.

A total of seven variants in four genes were found in 14 families. Two of these mutations (both in PRPF8) were novel and were found in two families. One consisted of a loss of 21 nucleotides (p.Val2325_Glu2331del) and the other of a single-nucleotide deletion causing a frameshift (p.Leu2315Leufs*2336Aspext*21). Figure 1 shows colour fundus pictures of patients RP90 and RP148 bearing these two novel mutations. Both novel variants co-segregated with the disease and were predicted to be pathogenic by MutationTaster.
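The protein-level name of the 21-nucleotide deletion follows from codon arithmetic: 21 nucleotides correspond to seven codons, so a deletion starting at Val2325 ends at residue 2331. A minimal sketch of this check (the helper name is hypothetical, and it assumes for simplicity that the deletion starts at a codon boundary):

```python
def inframe_deletion_span(first_residue, deleted_nt):
    """Return (number of residues removed, last residue removed).

    Simplification: assumes the deletion starts at a codon boundary.
    """
    if deleted_nt % 3 != 0:
        raise ValueError("length is not a multiple of three: reading frame would shift")
    n_residues = deleted_nt // 3
    return n_residues, first_residue + n_residues - 1

print(inframe_deletion_span(2325, 21))  # (7, 2331)
```

A deletion whose length is not a multiple of three, such as the single-nucleotide deletion described above, shifts the reading frame instead.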

Figure 1: Fundus photographs of patients with novel mutations in PRPF8.

(A) Patient RP90 (p.Val2325_Glu2331del) shows optic disc pallor, arteriolar attenuation and macular atrophy (right), with dense pigment in the mid-periphery (left). (B) Patient RP148 (p.Leu2315Leufs*2336Aspext*21) shows optic disc pallor, arteriolar attenuation and bone spicule-shaped pigment deposits in the mid-periphery. The left and right pictures correspond to the left and right eyes, respectively.

Two genes were involved in 37.93% of our cohort of families, with RHO affecting four probands with three different mutations and SNRNP200 affecting seven probands, all with the p.Ser1087Leu mutation21,22.

The high prevalence of mutations affecting the splicing process among our families (11 out of 29 probands studied), representing 38% of the probands in our adRP cohort, was unexpected. Most cases (9/29) were due to mutations affecting the genes SNRNP200 (7) and PRPF8 (2), which code for core spliceosomal proteins, although a splice site mutation in RHO23 was also detected (2/29).
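The percentages quoted in this section can be re-derived from the per-gene proband counts given in the text. The tally below is only a consistency check, with the grouping labels chosen here for illustration:

```python
# Proband counts per finding, as reported in the Results (29 probands in total)
TOTAL_PROBANDS = 29
findings = {
    ("SNRNP200", "core spliceosomal protein"): 7,
    ("PRPF8", "core spliceosomal protein"): 2,
    ("RHO", "splice site"): 2,
    ("RHO", "non-splicing"): 2,
    ("PRPH2", "non-splicing"): 1,
}

def pct(n, total=TOTAL_PROBANDS):
    """Percentage of probands, rounded to two decimals."""
    return round(100.0 * n / total, 2)

solved = sum(findings.values())
splicing = sum(n for (_, kind), n in findings.items() if kind != "non-splicing")

print(solved, pct(solved))      # 14 48.28
print(splicing, pct(splicing))  # 11 37.93
```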

With respect to SNRNP200, after performing Sanger sequencing in all available family members we identified the c.3260C>T mutation in a total of 12 cases from seven families (see representative family in Fig. 2A). Co-segregation analysis showed that two out of seven healthy subjects analysed for this variant in these families were mutation carriers, which likely indicates incomplete penetrance, similar to what has recently been reported for this variant in a study also involving a Spanish cohort7 (see Fig. 2B). We also found a total of nine individuals in two families with the c.937-1G>T mutation affecting RHO splicing. Interestingly, one of these nine individuals is asymptomatic, probably because the disease is still at an early stage given his young age (21 years old; see Fig. 2C and Supplementary Fig. S1).

Figure 2: Representative trees for families with the two most prevalent mutations found in SNRNP200 and RHO genes.

The p.Ser1087Leu mutation in SNRNP200 was found in families RP64 (A) and RP102 (B). (C) The c.937-1G>T mutation in the RHO splice acceptor site was found in a total of six individuals from family RP133, one of whom is a young asymptomatic patient (arrowhead). Genotypes are annotated as +/− (heterozygote) or −/− (wild type). Arrows indicate proband patients.

Finally, we also found mutations in both RHO and PRPH2 genes that were not related to the splicing process: a stop-loss mutation24 in patient RP105 and a missense mutation25 in patient RP135, both in RHO, and a missense mutation in PRPH2 (p.Gly266Asp)26 in patient RP19S. Patient RP19S was included in this study since he is the son of a patient with a mutation in PRPH2 that we had diagnosed previously20. Patient RP19S was asymptomatic at the initial diagnosis, when he was eight years old. However, two years later his molecular diagnosis confirmed the presence of the p.Gly266Asp mutation, and he was therefore re-examined. This revealed a granular fundus and few bone spicules in the inferior periphery, with no signs of optic disc pallor or vascular attenuation. The visual field showed a concentric defect (preserving the central 18 degrees) with a hyperautofluorescent ring in the macula upon autofluorescence examination (see Supplementary Fig. S2). Family trees of the rest of the patients recruited in the present study are included in Supplementary Fig. S3.

Discussion

In this work we have analysed the genotype and phenotype of a group of 29 adRP probands, using targeted NGS and Sanger sequencing to analyse 31 genes. We were able to detect putative disease-causing mutations in 14 out of the 29 probands analysed. This resulted in a clinically relevant genetic diagnosis ratio of 48.28%, which lies within the range of values reported previously, from about 24% to 88%6,7,27,28,29,30,31,32,33. Several factors may be responsible for this wide range of reported diagnosis ratios, including the approach used and the nature of the cohort involved. In the present study, part of our cohort of adRP patients had already been diagnosed in a previous study in which we screened some of the most prevalent adRP genes14,20; this might have contributed to the diagnostic ratio obtained.

Nevertheless, about 52% of the cases in our cohort of 29 adRP probands remain unsolved. One possible explanation is the presence of mutations in regions not covered by our panel, such as deep intronic regions. Another possibility is the presence of changes that our approach cannot detect, such as large genomic rearrangements. As such, it seems that the combination of NGS with other technologies, such as Multiplex Ligation-dependent Probe Amplification (MLPA) or Comparative Genomic Hybridisation arrays (aCGH), will be needed in order to address those genomic aberrations caused by copy number variations (CNV). A further possible explanation is the presence of mutations in novel RP genes among our patients, since most of them come from the Basque province of Gipuzkoa, a well-known genetically homogeneous region34. Consequently, sequencing of the whole exome/genome could help in the discovery of novel RP genes.

A remarkable finding was the high prevalence of mutations affecting the splicing process among our families (11 out of 29 probands studied), representing 38% of the probands in our adRP cohort.

The most frequent mutation was p.Ser1087Leu in SNRNP200. This gene encodes the 200-kDa helicase hBrr2. During splicing, the spliceosome undergoes structural rearrangements that are mediated by several RNA helicases including hBrr2, which is essential for unwinding of the U4/U6 snRNP duplex, a key step in the catalytic activation of the spliceosome complex35,36. hBrr2 comprises two helicase modules, one active and the other with regulatory activity.

All six mutations identified in SNRNP200 to date, including the Ser1087Leu mutation, are located in the hBrr2 protein region containing the first DExD-helicase module, a key component for the U4/U6 unwinding function in vivo and in vitro and for cell survival35,36,37. The first of the two consecutive Hel308-like modules, which comprises a DExD/H domain and a Sec63 domain, shows the highest level of conservation among species, thus pointing to its functional relevance38. The Ser1087Leu mutation has been reported to reduce unwinding activity and to promote the use of cryptic splice sites, thus pointing to an influence on splicing fidelity22,39.

Although most cases (9/29) were due to mutations affecting the genes SNRNP200 and PRPF8, which code for spliceosomal proteins, splice-site mutations in RHO were also detected (2/29). In previous studies, adRP probands with mutations affecting either spliceosome core factors or the splice sites of adRP genes accounted for 5–14.5% of all cases of adRP4,7,40,41. With regard to mutations in the SNRNP200 gene, although these were initially described only in two Chinese families21,22, they have since been reported to contribute to a significant portion of cases of adRP in the Caucasian population, ranging from 1.5% to 4.2%4,40,42,43.

The relatively high prevalence of splicing-related mutations found in our study is likely explained by a founder effect for two of the mutations, which were present in very small and rather isolated Spanish populations.

Splicing modulation has been proposed as a therapeutic approach for several diseases. Two of the most advanced approaches in this regard are the use of modified antisense oligonucleotides (ASOs) to target specific RNA sequences and redirect splicing, and the use of small molecules as modulators of the splicing process. A representative example of the ASO-based approach is exon skipping for Duchenne muscular dystrophy (DMD), where the muscular protein dystrophin is prematurely truncated by mutations that disrupt the open reading frame, thus leading to a non-functional protein. Exon skipping creates an internally deleted protein that is shorter than normal but partially functional, which leads to a much less severe phenotype in animal models of DMD. With respect to approaches based on small molecules and peptides, several splicing modulators have been shown to be effective in myotonic dystrophy (DM) and cancer18,44.

As regards retinal dystrophies, most advanced therapeutic approaches that target splicing are aimed at correcting the splicing of individual genes using mutation-adapted U1 small nuclear RNA for the RPGR gene45 or spliceosome-mediated RNA trans-splicing in RHO46. Both these approaches are based on cellular and animal models and have provided encouraging results. Once in the clinic, these promising approaches could be generalised and applied to other genes with splice donor site mutations45 and to all adRP genes rather than only to RPGR and RHO, respectively46.

With regard to therapeutic approaches targeting the splicing machinery itself, we are unaware of their use in retinal diseases. However, the first steps towards such strategies have already been taken for other diseases. It is therefore plausible that small molecules that reverse aberrant splicing will be applied to further diseases, including retinal dystrophies, in the near future, once our understanding of the disease mechanisms has improved and delivery systems and other technical issues have been resolved.

In summary, the combination of NGS with Sanger sequencing has allowed us to achieve a diagnostic rate of over 48%. As such, the methodology described herein exhibits a high diagnostic yield when applied to a well-defined adRP group and a relatively high number of genes. This will be of clinical relevance once ongoing studies on therapeutic options directed at manipulating splicing are completed.

Materials and Methods

Study subjects

RP patients were diagnosed at the Ophthalmology Department of Donostia University Hospital (San Sebastian, Spain). Diagnostic criteria were night blindness, peripheral visual field loss, pigmentary deposits resembling bone spicules, attenuation of retinal vessels, pallor of the optic disc and diminution in a- and b-wave amplitudes in the electroretinogram47. A total of 29 Spanish probands with a family tree compatible with adRP were included. Samples from an additional four individuals were included as controls: three from patients with known mutations that we had detected in previous analyses (positive controls) and one from a non-affected individual (negative control)14,20. Family trees were generated from information obtained from probands. All procedures performed in studies involving human participants received approval from the institutional research ethics committee and were in accordance with the Declaration of Helsinki (2013) or comparable ethical standards. Informed consent was obtained from all individual participants included in the study. For a detailed description of clinical features of all patients recruited in the present study see Supplementary Table S2.

Human sample collection

High molecular weight DNA was extracted from blood samples from RP patients and their available family members using an AutoGenFlex STAR instrument (AutoGen, Holliston, MA, USA) together with the FlexiGene DNA Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions. DNA concentrations were measured using a Nanodrop spectrophotometer (ND-1000, Thermo Scientific NanoDrop Products, Wilmington, DE, USA), and only those samples with 260/280 ratios ≥1.8 and 260/230 ratios ≥2 were used. DNA samples were stored at −80 °C.
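The purity criterion above is a pair of threshold checks on the absorbance ratios. A minimal sketch, in which the class, the function name and the sample values are hypothetical and only the two cut-offs come from the text:

```python
from dataclasses import dataclass

@dataclass
class DnaSample:
    sample_id: str
    a260_280: float  # 260/280 absorbance ratio
    a260_230: float  # 260/230 absorbance ratio

def passes_purity(sample, min_260_280=1.8, min_260_230=2.0):
    """True if both absorbance ratios meet the cut-offs quoted in the text."""
    return sample.a260_280 >= min_260_280 and sample.a260_230 >= min_260_230

# Hypothetical readings, for illustration only
samples = [
    DnaSample("S1", 1.85, 2.10),
    DnaSample("S2", 1.72, 2.05),  # fails the 260/280 cut-off
    DnaSample("S3", 1.80, 2.00),  # exactly on both cut-offs, so it passes
]
usable = [s.sample_id for s in samples if passes_purity(s)]
print(usable)  # ['S1', 'S3']
```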

Amplicon Library preparation

A total of 663 primer pairs were designed and grouped in two Ion AmpliSeq Primer Pools to flank 31 IRD genes with a total coverage of 98.37% using the Ion AmpliSeq Designer software (www.ampliseq.com). The regions excluded by the design represented only 1.63% of the total. Although most of the genes were related to adRP, representative genes associated with dominant forms of Leber congenital amaurosis and cone-rod dystrophies were also included since the clinical symptoms associated with these genes are often hard to distinguish from those associated with RP (RetNet; https://sph.uth.edu/retnet/disease.htm) (see Supplementary Table S3). The Ion AmpliSeq Library Preparation Kit v2.0 (Life Technologies, Foster City, CA, USA) was used to construct an amplicon library from genomic target regions with a maximum read length of approximately 200 base pairs (average length, 142 bp) for shotgun sequencing on the PGM. Briefly, target genomic regions were amplified by simple PCR using Ion AmpliSeq Primer Pools and 10 ng of each genomic DNA sample.

Sequencing Analysis

Ion Torrent Personal Genome Machine (PGM)

NGS was carried out on a PGM following the Ion PGM 200 Sequencing Kit protocol. Briefly, enriched Ion Sphere particles (ISPs) were annealed with the Ion Sequencing primer and mixed with the PGM200 Sequencing Polymerase. The polymerase-bound and primer-activated ISPs were then loaded into the previously checked and washed Ion 316 Chips (Life Technologies) and, after selecting the run plan on the Ion PGM System software, these chips were subjected to 500 cycles of sequencing with the standard nucleotide flow order. Signal processing and base calling for the data generated during the PGM runs were performed using the Ion Torrent platform-specific analysis software Torrent Suite version 4.0 to generate sequence reads. The sequences generated were aligned to the GRCh37/hg19 human genome for detection of genomic variants in the sequenced samples.

Sanger sequencing

Sanger sequencing was used to confirm the mutations detected by NGS and for co-segregation analysis. Primers were designed at least 60 bp upstream and downstream of the mutation. The amplicons were purified after PCR amplification (ExoSAP-IT, USB Corporation). Sequencing was performed by dye-terminator reaction on a 16-capillary ABI 3130xl platform (Applied Biosystems) according to the manufacturer's protocol. Sequences were analysed and compared with wild-type samples and reference sequences using the BioEdit Sequence Alignment Editor (Windows) and the Ensembl and NCBI databases.

High resolution melting (HRM) analysis

HRM analysis was used to re-analyse those genomic regions with no or very low coverage in NGS platforms, following the previously described methodology20.

Relevant variant identification and pathogenicity score

In order to determine genomic variants of relevance, we selected putative disease-causing variants using the following criteria: 1) variants previously reported as pathogenic, or 2) loss-of-function variants, such as stop gain, frameshift, small deletions or duplications (INDELS) and splice site variants, or 3) novel missense variants predicted to be damaging or highly pathogenic in at least four out of five web-based pathogenicity predictors, namely SIFT (<0.05), Polyphen2 (>0.750), PROVEAN48, GVGD49 and MutationTaster50. Furthermore, all variants selected had to have a Minor Allele Frequency (MAF) of less than 0.002, as obtained from human genome databases (see below), and had to be absent from a Spanish in-house allele database containing information from 578 unrelated Spanish individuals, none of whom exhibited any IRD-related disease51 (http://csvs.babelomics.org/; see Supplementary Fig. S4).
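The selection criteria above can be read as a small decision procedure. The following is a minimal sketch of that logic, not the pipeline actually used in the study: the class, field and function names are hypothetical, and the three predictors for which the text gives no numeric cut-off are represented as booleans.

```python
from dataclasses import dataclass

# Consequence labels are illustrative, not taken from a specific annotation tool
LOF_TYPES = {"stop_gain", "frameshift", "indel", "splice_site"}

def damaging_calls(sift, polyphen2, provean, gvgd, mutationtaster):
    """Count how many of the five predictors call a missense variant damaging."""
    return sum([sift < 0.05, polyphen2 > 0.750,
                bool(provean), bool(gvgd), bool(mutationtaster)])

@dataclass
class Variant:
    consequence: str
    maf: float = 0.0                 # 0.0 if unobserved in population databases
    in_house_count: int = 0          # observations in the in-house database
    reported_pathogenic: bool = False
    n_damaging: int = 0              # result of damaging_calls() for missense variants

def is_candidate(v, maf_cutoff=0.002, min_damaging=4):
    """Apply the frequency filters first, then criteria 1 to 3 in order."""
    if v.maf >= maf_cutoff or v.in_house_count > 0:
        return False
    if v.reported_pathogenic:        # criterion 1
        return True
    if v.consequence in LOF_TYPES:   # criterion 2
        return True
    # criterion 3: novel missense with at least four damaging predictions
    return v.consequence == "missense" and v.n_damaging >= min_damaging

# Hypothetical examples
n = damaging_calls(sift=0.01, polyphen2=0.98, provean=True, gvgd=False, mutationtaster=True)
print(n)                                                        # 4
print(is_candidate(Variant("missense", n_damaging=n)))          # True
print(is_candidate(Variant("frameshift", maf=0.01)))            # False
```

Applying the frequency filters before the three criteria reflects the statement that all selected variants had to satisfy them, including previously reported ones.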

Additional Information

How to cite this article: Ezquerra-Inchausti, M. et al. High prevalence of mutations affecting the splicing process in a Spanish cohort with autosomal dominant retinitis pigmentosa. Sci. Rep. 7, 39652; doi: 10.1038/srep39652 (2017).
