Introduction

Retinitis pigmentosa (RP; MIM #268000) affects 1 in 3,000 to 5,000 people worldwide and is one of the most common forms of inherited retinal degeneration1. RP typically starts with night blindness during the first two decades of life and progresses gradually to tunnel vision and, in some patients, eventual complete blindness. These clinical manifestations result from progressive dysfunction and death of rod photoreceptors, followed by cone photoreceptors, throughout the retina. The clinical features of RP are highly variable and may overlap with those of other inherited retinal degenerations, such as cone-rod dystrophy (CRD), Leber congenital amaurosis, and Usher syndrome. In addition to this clinical heterogeneity, the genetic etiology of RP is complex. RP may show various modes of inheritance, including autosomal dominant, autosomal recessive, X-linked and, in rare cases, digenic1. To date, mutations in at least 80 genes have been associated with RP (RetNet, https://sph.uth.edu/retnet/). Therefore, accurate molecular diagnosis of RP patients is challenging but essential for better patient management and personalized treatment.

In recent years, a number of studies have applied next-generation sequencing (NGS) to understand the molecular basis of human Mendelian disorders, including RP2,3,4. Specifically, customized target capture sequencing has been used to screen for mutations in known disease-causing genes with high efficiency. With this method, novel disease-causing alleles as well as genotype-phenotype correlations have been identified, substantially enhancing our understanding of allele pathogenicity, protein function and population genetics. Hence, NGS-based molecular diagnosis has proven to be a robust approach for assessing Mendelian disease at the molecular level.

Hispanic Americans are residents of the United States (US) descended from people of Latin American countries or the Iberian Peninsula5. They account for 17% of the US population and represent a fast-growing ethnic group6, highlighting the need to understand the molecular basis of genetic disorders in this population. However, the mutation spectrum of RP in this population has not been evaluated before, except in isolated cases. We collected 35 unrelated Hispanic RP probands from the Miami area and performed next-generation sequencing-based mutation screening using a panel of 226 retinal disease genes. Our data allowed us to solve 23 probands and revealed potential characteristics of the RP mutation spectrum in this population.

Results

The DNA of 35 unrelated Hispanic RP probands was collected and sequenced

A total of 35 probands with a primary diagnosis of RP were recruited for this study. Among them, there were 10 adRP cases, 11 arRP cases, 2 X-linked cases and 12 simplex cases (Fig. 1A). Individuals of Cuban origin account for the largest proportion (40%), followed by those of Colombian (14%) and Puerto Rican (14%) origin. The probands’ DNA underwent capture enrichment and high-throughput sequencing. High-quality sequencing data with an average coverage of 91× were obtained. The sequence coverage of the targeted regions is evenly distributed, with evenness scores of 0.8 across all samples. Consistently, 96.0% of the targeted regions have coverage >20× and 91.1% of the targeted regions have coverage >40× (Fig. 1B). An average of 4,852 single nucleotide polymorphisms (SNPs) and 1,146 small insertions and deletions (INDELs) were obtained for each sample. After filtering and annotation, an average of 16 rare variants remained per sample.
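The coverage figures above (mean depth and the fraction of the target covered above a depth threshold) can be illustrated with a minimal sketch. It assumes per-base depths for the target regions are already available as a plain list; the function name `coverage_summary` and the toy depths are hypothetical and are not taken from the study's pipeline or data. The evenness score is not computed here because its definition is not given in the text.

```python
from statistics import mean

def coverage_summary(depths, thresholds=(20, 40)):
    """Return the mean depth and the fraction of target bases above each threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    summary = {"mean": mean(depths)}
    for t in thresholds:
        # Strictly greater than, matching the ">20x" / ">40x" wording in the text.
        summary[f">{t}x"] = sum(d > t for d in depths) / len(depths)
    return summary

# Toy per-base depths for a hypothetical target region (not data from this study).
toy_depths = [5, 18, 25, 41, 60, 95, 110, 130, 150, 200]
stats = coverage_summary(toy_depths)
print(stats)
```

In practice the per-base depths would come from a depth-reporting tool run over the capture design, and the same summary would be computed per sample.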

Figure 1

Summary of the 35 Hispanic RP probands and NGS coverage statistics.

(A) The inheritance patterns of the probands. (B) The NGS coverage statistics on target regions.

Putative pathogenic mutations were identified in 23 probands

To identify pathogenic variants in these probands, we applied a stepwise mutation identification strategy as previously described3. Putative pathogenic mutations were found in 23 probands, giving a solving rate of 66%. In total, we identified 28 disease-causing alleles (Table 1). Of these alleles, 13 were reported previously and 15 are novel. The genetic evidence for nine putative pathogenic missense variants, including their population frequencies and in silico prediction results, is listed in Table 2. All putative mutations were validated by Sanger sequencing, and co-segregation testing was performed when DNA samples from family members were available.

Table 1 Putative pathogenic variants identified in 23 tentatively solved RP probands.
Table 2 Population frequencies and in silico predictions of novel missense putative pathogenic variants.

PRPF31 is frequently mutated in the Hispanic RP cohort

Among the ten adRP cases, mutations in PRPF31, a gene involved in pre-mRNA processing, account for five. The proband BLM001 (fundus and OCT images shown in Fig. 2A–D) carries a novel PRPF31 protein-truncating mutation (c.A172T, p.K58*). Sanger sequencing showed that both the proband’s asymptomatic father and her sister, who has a mild retinal phenotype, carry this mutation (Fig. 3A), indicating incomplete penetrance and variable expressivity in this family. In proband BLM037, we identified a PRPF31 frameshift mutation previously reported in a large Mexican family (c.866_879delGGAAAGCGGCCCGG, p.R289Pfs*30)7. The family in the present study is also of Mexican origin and shows low penetrance (Fig. 3B), similar to the reported family7. More strikingly, as in the original study7, we observed an apparent sex bias, with females more likely to be affected (Fig. 3B). However, the at-risk individuals in this family were not available for detailed clinical examination, which prevents an accurate evaluation of penetrance and expressivity.

Figure 2

Clinical features of selected probands.

Fundus images of BLM001, OD (A) and OS (B). OCT images of BLM001, OD (C) and OS (D). Visual field tests of BLM049, OD (E) and OS (F). Fundus images of BLM033, OD (G) and OS (H). OCT images of BLM033, OD (I) and OS (J).

Figure 3

Selected family pedigrees discussed in this study.

(A) Pedigree of adRP BLM001. The proband’s sister, labeled in grey, has a mild retinal phenotype. (B) Pedigree of adRP BLM037. Individuals labeled with a dot are obligate carriers of the PRPF31 mutation. (C) Pedigree of X-linked RP BLM049 with CHM mutation.

A novel splicing mutation (c.322+4_322+7delAGTG, p.?) in PRPF31 was identified in three unrelated adRP probands (BLM067, BLM043 and BLM101). This deletion is located at the 5′ end of intron 5. It is predicted to disrupt the splice donor site and is likely to be a loss-of-function mutation. Interestingly, all three adRP families originated from Cuba. In addition, we identified a rare variant (chr19:54632400, C>A, ExAC frequency: 9 in 25,962), located 7 kb away from the PRPF31 mutation, that is shared by all three families. This strongly suggests that the three probands share the same haplotype and that this allele is likely to be a founder mutation in Cuba.
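To see why a shared rare variant near the mutation points towards a common haplotype, a back-of-the-envelope calculation helps. The sketch below estimates how likely it is that three unrelated individuals would all carry the linked variant purely by chance. It rests on assumptions that are not stated in the text: the ExAC count is read as allele count over allele number, Hardy-Weinberg proportions hold, the three families are independent, and the ExAC frequency is a reasonable proxy for the frequency in the Cuban population.

```python
# ExAC counts quoted in the text; treated here as allele count / allele number (assumption).
allele_count = 9
allele_number = 25_962

q = allele_count / allele_number   # allele frequency of the linked variant
p_carrier = 1 - (1 - q) ** 2       # chance one person carries at least one copy (HWE assumed)
p_all_three = p_carrier ** 3       # three independent, unrelated individuals (assumption)

print(f"allele frequency: {q:.2e}")
print(f"carrier probability: {p_carrier:.2e}")
print(f"all three carriers by chance: {p_all_three:.2e}")
```

Under these assumptions the chance of coincidental sharing is below one in a billion, which is consistent with the founder-haplotype interpretation. If the variant is more common in the relevant population than in ExAC, the probability would be correspondingly higher.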

Mutations in non-canonical RP genes were identified in four cases

In one patient, BLM071, compound heterozygous missense mutations in WDR19 were identified (c.G3533A; p.R1178Q and c.A2561C; p.K854T). The K854T mutation is novel, and the R1178Q mutation was previously reported in a patient with nephronophthisis (NPHP), polydactyly, Caroli disease and retinal dystrophy8. Mutations in WDR19 are typically associated with a range of skeletal and kidney ciliopathies, although in one report WDR19 mutations were found in non-syndromic RP patients. We re-examined this 32-year-old patient and confirmed that she has no systemic symptoms. This case further supports the view that WDR19 mutations can lead to non-syndromic arRP.

RP can frequently overlap with other retinal dystrophies because of similarities in the affected tissues and disease progression. In the male proband BLM049, we identified a known hemizygous splicing mutation (c.116+1G>A, p.?) in CHM. The pedigree is also consistent with an X-linked mode of inheritance (Fig. 3C). Mutations in CHM have been reported to cause choroideremia, a disease that shares the night blindness and progressive tunnel vision phenotypes with RP. The 49-year-old proband shows severe loss of peripheral vision (Fig. 2E,F) and diffuse pigmentary retinal degeneration with macular atrophy, a late-stage RP phenotype that may be difficult to distinguish from late-stage choroideremia (Supplementary Figure 1). Similarly, in probands BLM033 and BLM066 (clinical data shown in Fig. 2G–J, Supplementary Figures 2 and 3), we identified pathogenic variants in two CRD-causing genes (C21ORF2, IMPG1), which led to a clinical re-diagnosis of CRD after reassessment of the probands’ clinical phenotypes.

Discussion

The contemporary Hispanic American population derives mainly from the U.S. territorial expansion into formerly Spanish-speaking regions from 1819 to 1848, as well as from a marked increase in immigration from Latin American countries since the mid-20th century9. According to the U.S. Census Bureau, about half of Hispanic Americans are of European origin and over 40% are of European-African or European-Native American mixed origin6. Thus, the RP mutation spectrum of Hispanic Americans would be expected to resemble, in part, that of Europeans (particularly Spanish), and also to include certain alleles that reflect the distinct genetic admixture. The capture sequencing data in our study are consistent with this hypothesis. Of the 13 previously reported disease-causing alleles, 9 are from European studies, including 6 identified in Spanish RP families, consistent with the history of population migration from Spain to the Americas since the 16th century. On the other hand, our data also suggest RP genetic etiologies that are unique to Hispanic Americans. For example, EYS is a frequently mutated gene in previously reported cohorts, including the Spanish population10,11. However, we did not identify any case with EYS mutations among the 23 arRP and simplex RP cases. PRPF31 is mutated in less than 10% of adRP cases in previous studies12,13, whereas in our study PRPF31 mutations account for 5 out of 10 adRP families. We also identified several population-specific pathogenic variants, such as the PRPF31 founder splicing mutation in families of Cuban origin and the USH2A variant (c.C13664T; p.P4555L) found only in Latino controls in the ExAC database. These observations suggest that genetic admixture with African and Native American populations over several centuries has resulted in an RP mutation spectrum different from that of the European/Spanish population.

One gene of interest in our study is PRPF31, not only because of its seemingly higher prevalence in the adRP cases, but also because of the variable penetrance and sex bias we observed. Given that PRPF31 disease-causing alleles are enriched for loss-of-function variants, we can infer that PRPF31 mutations lead to adRP through a haploinsufficiency mechanism. Studies have shown that differential PRPF31 expression levels and some genetic modifiers contribute to the incomplete penetrance or variable expressivity of PRPF31-associated adRP14,15,16. The sex bias observed in both Mexican families, the one in our study and the one in the previous report7, is likely to be specifically associated with this sub-population, or even with this allele per se, since no similar phenomenon has been found in other PRPF31-associated adRP. The low penetrance and sex bias also highlight the particular need for genetic testing and subsequent counseling of asymptomatic carriers of disease-causing alleles.

Our data confirmed the contribution of mutations in non-canonical RP genes to non-syndromic RP cases. At the molecular level, RP can frequently overlap with other retinal degenerations such as CRD and Stargardt disease17, which is reflected by the mutations in CHM, C21ORF2 and IMPG1 in our study. This phenomenon has increasingly led us to rethink how molecular diagnosis aids clinical diagnosis in the precision medicine era. In addition, hypomorphic mutations in about a dozen syndrome-causing genes can lead to milder phenotypes (non-syndromic RP), as shown by a series of recent WES studies18,19,20,21,22,23 and, in our study, by the WDR19 case. These studies often significantly expanded the phenotype spectrum associated with syndrome-causing genes, sometimes unexpectedly, highlighting the complicated molecular etiology underlying human retinal degenerations. Hence, we suggest that additional genes, particularly those associated with retinal phenotypes in Mendelian disorders but with an underappreciated retinal function, should be included in the target capture sequencing panel to reveal further genetic complexity in non-syndromic RP.

In 12 probands, we did not identify putative disease-causing variants. Several approaches may help to reveal the missing heritability. First, identifying mutations in novel disease-causing genes requires whole-exome sequencing studies. Second, copy number variations in known RP genes may need to be resolved by higher-coverage target capture sequencing or array comparative genomic hybridization (aCGH). Third, the unidentified mutations in known RP genes may reside in deep intronic regions that affect pre-mRNA splicing, or in regulatory sequences. Future studies focusing on non-coding regions, with more whole-genome sequencing control data available and improved variant annotation, would help to uncover pathogenic mutations of this category.

In summary, to our knowledge, this study represents the first molecular diagnosis of a sizeable Hispanic RP patient cohort in the U.S. Our targeted NGS-based diagnostic approach solved 23 out of 35 RP cases and identified 15 novel disease-causing alleles, thus improving our knowledge of RP etiology in this population. It should be noted that the genetic composition of Hispanic Americans is not homogeneous and varies with geographic region and country of ancestral origin, among other factors. Hence, additional molecular diagnostic studies with detailed ethnic classification would allow more accurate characterization of the RP mutation spectrum and support better patient care in this fast-growing population in the U.S.

Methods

Clinical diagnosis of RP patients and sample collection

The studied cohort consisted of 35 probands and their family members recruited in the Miami area. A detailed clinical history was taken and a complete ophthalmic examination, including best-corrected Snellen visual acuity, visual field testing, slit-lamp biomicroscopy, fundoscopy and full-field electroretinography, was performed on each patient. The research was conducted in accordance with the tenets of the Declaration of Helsinki. Written informed consent was obtained from each participant or their legal guardian. Peripheral blood was collected in EDTA tubes for DNA extraction. All experimental methods were approved by the Institutional Review Boards of Baylor College of Medicine and University of Miami Miller School of Medicine, and they were performed in accordance with the relevant guidelines and regulations.

Library preparation and capture sequencing

Pre-capture Illumina libraries were generated as described previously24,25,26. Briefly, 1 μg of genomic DNA was sheared into 300–500 bp fragments. The 5′ ends of the DNA fragments were phosphorylated by polynucleotide kinase and a single adenine base was added to the 3′ ends using Klenow fragment (3′→5′ exo-). Y-shaped index adapters were ligated to the DNA fragments, and 10 cycles of PCR amplification were then applied to each sample. Finally, 300–500 bp fragments were isolated by bead purification. The pre-capture libraries were quantified with the PicoGreen fluorescence assay kit (Invitrogen, Carlsbad, CA, USA). For each capture reaction, 25–50 samples were pooled together. The targeted DNA was captured, washed and recovered using Agilent Hybridization and Wash Kits (Agilent Technologies, Santa Clara, CA, USA). Captured DNA libraries were sequenced on an Illumina HiSeq 2000 (Illumina, Inc., San Diego, CA) as 100 bp paired-end reads following the manufacturer’s protocols. The genes included in the target capture panel27 are listed in the Supplementary Information (Supplementary Table 1).

Bioinformatics analysis

Paired-end sequencing reads were aligned to the human hg19 genome using BWA version 0.6.1. Base quality recalibration and local realignment were performed with the Genome Analysis Tool Kit version 1.05974. Atlas-SNP2 and Atlas-Indel2 were used to call SNPs and indels. Variant frequency data were obtained from a set of public and internal control databases, including the Exome Aggregation Consortium (ExAC) database, the CHARGE consortium28, ESP-650029 and the 1000 Genomes Project30. Since RP is a rare Mendelian disorder, variants with a frequency higher than 1/200 (for a recessive model) or 1/10,000 (for a dominant model) were filtered out. Synonymous and deep intronic (distance >10 bp from exon-intron junctions) variants were then excluded from further analysis. ANNOVAR (version 11/12/2014) and the dbNSFP suite (version 2.9, which contains SIFT, PolyPhen-2, LRT, MutationTaster, MutationAssessor, etc.) were used to annotate protein-altering changes. Known retinal disease-causing alleles were detected based on the HGMD professional database (version 11/15/2014).
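The frequency and variant-type filters described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the `Variant` record, its field names and the toy variants are hypothetical, and only the thresholds (1/200, 1/10,000 and 10 bp) are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the text.
MAX_FREQ = {"recessive": 1 / 200, "dominant": 1 / 10_000}
MAX_INTRON_DISTANCE = 10  # bp from the exon-intron junction

@dataclass
class Variant:
    gene: str                               # placeholder gene label
    max_control_freq: float                 # highest frequency across control databases
    effect: str                             # e.g. "missense", "synonymous", "intronic"
    intron_distance: Optional[int] = None   # bp from nearest junction; None if exonic

def passes_filters(v: Variant, model: str) -> bool:
    """Apply the frequency, synonymous and deep-intronic filters for one inheritance model."""
    if v.max_control_freq > MAX_FREQ[model]:
        return False  # too common for a rare Mendelian disorder under this model
    if v.effect == "synonymous":
        return False
    if v.intron_distance is not None and v.intron_distance > MAX_INTRON_DISTANCE:
        return False  # deep intronic
    return True

# Toy variants (hypothetical, not from the study).
toy = [
    Variant("GENE_A", 0.0, "missense"),
    Variant("GENE_B", 0.002, "missense"),
    Variant("GENE_C", 0.0, "synonymous"),
    Variant("GENE_D", 0.0, "intronic", intron_distance=4),
    Variant("GENE_E", 0.0, "intronic", intron_distance=50),
]
kept_recessive = [v.gene for v in toy if passes_filters(v, "recessive")]
kept_dominant = [v.gene for v in toy if passes_filters(v, "dominant")]
print(kept_recessive, kept_dominant)
```

The example keeps the two inheritance models separate because the same variant can pass the recessive threshold while being too common for a dominant model, as `GENE_B` does here.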

Sanger validation and co-segregation test

Each putative disease-causing mutation was validated by Sanger sequencing. Primer3 was used to design a pair of primers generating an amplicon that covers a 500 bp region around the mutation site. The PCR amplicons were Sanger sequenced on an ABI 3730XL Genetic Analyzer. The results were analyzed with Sequencher 5.0.

Additional Information

How to cite this article: Zhang, Q. et al. Next-generation sequencing-based molecular diagnosis of 35 Hispanic retinitis pigmentosa probands. Sci. Rep. 6, 32792; doi: 10.1038/srep32792 (2016).