Whole-exome sequencing identifies OR2W3 mutation as a cause of autosomal dominant retinitis pigmentosa

Retinitis pigmentosa (RP), a heterogeneous group of inherited ocular diseases, is a genetic condition that causes retinal degeneration and eventual vision loss. Although some genes have been associated with RP, a large proportion of clinical cases remains unexplained. Here we report a four-generation Chinese family with RP, in which 6 of 9 members of the second generation were affected by the disease. To identify the genetic defect in this family, whole-exome sequencing, together with validation by Sanger sequencing, was performed to find possible pathogenic mutations. After filtering against public and in-house databases, a novel missense mutation, c.424C>T (p.R142W) in the OR2W3 gene, was identified as a potentially causative mutation for autosomal dominant RP. The mutation co-segregated with the disease phenotype over four generations and was also observed in another independent three-generation family. RT-PCR analysis showed that OR2W3 was expressed in the HESC-RPE cell line. These results enhance current understanding of the genetic basis of RP and provide helpful clues for designing future studies of genetic factors in familial RP.

Retinitis pigmentosa (RP, MIM#268000), a heterogeneous group of inherited ocular diseases marked by progressive retinal degeneration, affects approximately 1 in 3,000 to 5,000 people 1 . It is clinically characterized by degenerative symptoms including progressive night blindness, tunnel vision, and bone-spicule pigmentation in the retina, which lead to severe vision impairment and often blindness 2 . The disease is highly heterogeneous both clinically and genetically and can be inherited in autosomal dominant (about 30-40% of total cases), autosomal recessive (50-60%), and X-linked (5-15%) modes 3 . Presently, 57 genes/loci have been associated with RP (https://sph.uth.tmc.edu/RetNet/home.htm): 32 with autosomal recessive RP, 20 with autosomal dominant RP, and 5 with X-linked RP. However, a large proportion of clinical cases still cannot be explained by these genes.
Whole-exome sequencing has become a powerful strategy for detecting rare causal variants of Mendelian disorders, including RP, because disease-causing mutations usually change an encoded protein [4][5][6][7][8][9][10][11][12][13][14][15][16][17] . However, many of these studies focus only on simplex families or on one affected child from multiplex families. In this study, we selected a large four-generation family in which six of nine members in the second generation were affected by RP. Prescreening excluded known RP-causing mutations. We aimed to identify possible causal genes of RP in this Chinese family using a whole-exome sequencing approach, together with validation in another independent three-generation family.

Methods
Subjects and clinical evaluation. We recruited a four-generation Chinese family from Chongqing in Southwest China. Six of nine members in the second generation were affected by RP (Figure 1-A). All participants underwent a full ophthalmologic examination, including slit-lamp biomicroscopy, fundus examination, visual field testing, and full-field flash electroretinography (ERG). Blood-derived DNA was available from five cases (II-1, II-2, II-3, II-4, II-7) and from twelve healthy family members (II-8, II-9, III-1, III-2, III-3, III-15, III-16, IV-1, IV-2, IV-4, IV-5, IV-6). The study was approved by the Ethics Review Committee of Third Military Medical University and carried out in accordance with the Declaration of Helsinki. Peripheral venous blood samples were collected after signed informed consent was obtained.
DNA extraction, mutation screening. Genomic DNA was extracted from peripheral leukocytes using the TIANamp Blood DNA Kit (Tiangen Biotech Co. Ltd, Beijing, China). To determine whether RP in this family was caused by unknown genes, the genes previously shown to be mutated in RP patients (Supplementary Table 1) were first screened in one RP case (II-1) and one healthy control (II-8) using a targeted gene capture chip developed by BGI, Shenzhen, China. Sanger sequencing was then used to confirm the positive findings.
Whole-exome sequencing. Whole-exome sequencing was performed by BGI, Shenzhen, China, to identify the disease-associated genes in five subjects: four RP cases (II-2, II-3, II-4, and II-7) and one healthy control (II-9). Thirty micrograms (μg) of human genomic DNA was extracted from the peripheral venous blood sample of each participant. Each qualified genomic DNA sample was randomly fragmented with the Covaris Acoustic System, and adapters were ligated to both ends of the resulting fragments. The DNA was then amplified by ligation-mediated PCR (LM-PCR), purified, and hybridized to the Nimblegen SeqCap EZ Library v3.0 (Roche/NimbleGen, Madison, WI) for enrichment. Both non-captured and captured LM-PCR products were subjected to quantitative PCR to estimate the magnitude of enrichment. Each captured library was then loaded on the HiSeq 2500 platform (Illumina, San Diego, CA). We performed high-throughput sequencing for each captured library to ensure that each sample met the desired average sequencing depth (90×). Raw image files were processed by Illumina base calling Software 1.7 with default parameters, and the sequences of each individual were generated as 90 bp paired-end reads.
Bioinformatics analysis. The clean reads were aligned to the human reference genome (GRCh37, UCSC hg19) by SOAPaligner (soap2.21) 18 . Based on the SOAPaligner results, SOAPsnp (version 1.03) was used to assemble the consensus sequence and call genotypes in the target regions 19 . For indel analysis, BWA was used to map reads onto the reference 20 , and the alignment result was then passed to the Genome Analysis Toolkit (GATK) to identify the breakpoints 21 . Only mapped reads were used for subsequent analysis. Coverage and depth calculations were based on all mapped reads and the exome region. All variants were first filtered against several public databases using a minor allele frequency (MAF) cutoff of > 0.5%, including dbSNP135, 1000 Genomes data (pilot 1, 2, 3), HapMap (release 24), and the YH project 22 , and then against two in-house databases (sample sizes of 7,000 from Vanderbilt Epidemiology Center and 1,414 from BGI, respectively; both databases comprise samples from the Chinese population, which has a genetic background similar to that of the subjects in the current study).
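
The frequency-based filtering step described above can be illustrated with a minimal Python sketch. It assumes that each database is represented as a plain mapping from a variant identifier to its MAF and that a variant is discarded when any database reports a MAF above 0.5%; the function names, variant labels, and toy frequencies are hypothetical and are not taken from the study's actual pipeline.

```python
from typing import Dict, Iterable, List

MAF_THRESHOLD = 0.005  # 0.5%, the cutoff stated in the text


def is_rare(variant_id: str,
            databases: Iterable[Dict[str, float]],
            threshold: float = MAF_THRESHOLD) -> bool:
    """Return True if no database reports a MAF above the threshold."""
    for db in databases:
        maf = db.get(variant_id)  # None: variant absent from this database
        if maf is not None and maf > threshold:
            return False
    return True


def filter_rare(variant_ids: Iterable[str],
                databases: List[Dict[str, float]],
                threshold: float = MAF_THRESHOLD) -> List[str]:
    """Keep only variants that are rare or absent in every database."""
    return [v for v in variant_ids if is_rare(v, databases, threshold)]


# Hypothetical toy data: variant labels and frequencies are illustrative only.
public_db_1 = {"var_a": 0.12, "var_b": 0.001}
public_db_2 = {"var_b": 0.002, "var_c": 0.03}
candidates = ["var_a", "var_b", "var_c", "var_d"]

rare = filter_rare(candidates, [public_db_1, public_db_2])
print(rare)
```

In this toy run, var_a and var_c are dropped because one database reports them as common, while var_b (rare everywhere) and var_d (absent everywhere) are retained.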

Mutation validation. To determine whether any of the remaining variants co-segregated with the disease phenotype in this family, the mutations were then tested by Sanger sequencing in all other family members for whom DNA samples were available. Direct polymerase chain reaction (PCR) products were sequenced using an ABI 3730 Genetic Analyzer. Sequencing data were compared pairwise with the Human Genome database (GRCh37, UCSC hg19) to detect mutations. The possible causative mutation was further confirmed using the RP pedigree database of BGI.

Results
Clinical characteristics. Figure 1-A presents the pedigree of the four-generation Chinese family, which was consistent with autosomal dominant inheritance. In total, 7 members of this family were affected by RP, including two deceased members (I-1 and II-5). The two deceased members showed clinical symptoms and a disease course similar to those of the 5 living affected members (II-1, II-2, II-3, II-4, II-7). Night blindness appeared first, followed by progressive reduction of the visual field, and finally complete blindness in later life. Table 1 presents the clinical data of the 5 living affected individuals. All patients had a progressive bilateral decrease of visual acuity and peripheral visual field, together with photophobia. Fundus photography revealed similar clinical features in the affected individuals, including attenuation of the retinal vessels, bone-spicule pigmentation, chorioretinal degeneration with peripapillary atrophy, optic disc pallor, and enlarged optic cups, compared with the normal subject (Figure 1-B, 1-C). ERG recordings showed no detectable cone or rod responses in the patients.
Mutation screening. To find the causative mutations and exclude the known genes, we sequenced all exons and the flanking intronic splice sites of the previously known causative genes of RP (Supplementary Table 1) in one RP case (II-1) and one healthy control (II-8), and confirmed the results by Sanger sequencing. None of the genes showed pathogenic mutations, suggesting that the familial cases in the current study were caused by mutations in unknown genes.
Whole-exome sequencing. Whole-exome sequencing was performed on five subjects: four RP cases (II-2, II-3, II-4, and II-7) and one healthy control (II-9). An average of 11,747 MB of raw data was generated, with a mean depth of 101.74-fold for the target regions. Approximately 98.64% of the targeted bases (64,482,551 bp in length) were covered sufficiently to pass our thresholds for calling SNPs and indels. We identified 144,701-150,367 SNPs and 15,368-16,173 indels in the five sequenced subjects. For rare inherited diseases, the frequency of possible pathogenic mutations in the healthy population should be very low. Therefore, as shown in Table 2 and Table 3, the results were filtered against several public variation databases, removing all previously reported variants. We focused only on non-synonymous (NS) variants, variants in splice sites, and short frame-shift coding insertions or deletions (indels). After filtering against these databases, we found that 72 SNPs and 15 indels were shared by the affected patients and absent in the healthy control. The remaining variants were then filtered against two in-house databases, which left 10 SNPs (OR2W3 R142W, DNM2 R297H, ROBO2 P1106S, CSMD3 K3075Q, ZHX2 G799R, PALM3 E658Q, HAP1 E269Q, BRIP1 N775S, INTS2 I775L, and TSSC4 H81R).
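The step of keeping variants shared by all sequenced patients and absent in the sequenced control amounts to a set intersection followed by a set difference. The sketch below is only an illustration under that assumption: the sample IDs follow the text, but the function name and the variant labels (v1 to v6) are hypothetical placeholders, not the study's data.

```python
from typing import Dict, Set


def shared_case_only_variants(cases: Dict[str, Set[str]],
                              controls: Dict[str, Set[str]]) -> Set[str]:
    """Return variants carried by every case and by no control."""
    if not cases:
        raise ValueError("at least one affected sample is required")
    shared = set.intersection(*cases.values())      # present in all cases
    in_controls = set().union(*controls.values())   # present in any control
    return shared - in_controls


# Sample IDs follow the text; variant labels are hypothetical.
cases = {
    "II-2": {"v1", "v2", "v3"},
    "II-3": {"v1", "v2", "v4"},
    "II-4": {"v1", "v2"},
    "II-7": {"v1", "v2", "v5"},
}
controls = {"II-9": {"v2", "v6"}}

candidates = shared_case_only_variants(cases, controls)
print(sorted(candidates))
```

Here v2 is shared by all four cases but is also seen in the control, so only v1 survives.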
Phenotype & genotype co-segregation and validation of the mutations. The ten remaining mutations were then tested by Sanger sequencing in the other twelve family members for whom DNA samples were available, to determine whether they co-segregated with the disease phenotype (Figure 2). Genetic analysis demonstrated that only OR2W3 (Olfactory receptor 2, W3) R142W was carried by the affected patients and absent in the healthy controls. The OR2W3 R142W mutation was also observed in another three-generation RP family (Figure 1-D), comprising 3 cases (II-1, II-2, III-1) and 1 control (I-1): the three RP cases carried the same mutation and the healthy control did not. Furthermore, immunofluorescent analysis of HESC-RPE revealed the expression of RPE cell markers (Mitf, PAX6, and ZO-1), while RT-PCR analysis showed that HESC-RPE expressed OR2W3 (Figure 3).
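The co-segregation criterion used above can be stated as a simple rule: a variant co-segregates if every genotyped affected member carries it and no genotyped unaffected member does. The sketch below illustrates this rule under the assumption of full penetrance. The member IDs are those listed in the Methods, whereas the function name, the variant names, and the carrier calls are hypothetical.

```python
from typing import Dict, List, Set

# Genotyped family members as listed in the Methods.
AFFECTED = {"II-1", "II-2", "II-3", "II-4", "II-7"}
UNAFFECTED = {"II-8", "II-9", "III-1", "III-2", "III-3", "III-15",
              "III-16", "IV-1", "IV-2", "IV-4", "IV-5", "IV-6"}


def cosegregates(carriers: Set[str],
                 affected: Set[str] = AFFECTED,
                 unaffected: Set[str] = UNAFFECTED) -> bool:
    """True if all affected members carry the variant and no unaffected one does.

    Assumes full penetrance; incomplete penetrance is not modelled.
    """
    return affected <= carriers and not (carriers & unaffected)


# Hypothetical carrier calls for three illustrative variants.
carrier_calls: Dict[str, Set[str]] = {
    "variant_A": set(AFFECTED),            # carried by exactly the affected members
    "variant_B": AFFECTED | {"III-2"},     # also carried by an unaffected member
    "variant_C": AFFECTED - {"II-1"},      # missing in one affected member
}

segregating: List[str] = [name for name, carriers in carrier_calls.items()
                          if cosegregates(carriers)]
print(segregating)
```

Only variant_A passes in this toy example; the other two fail for the two distinct reasons that exclude a candidate.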
Table footnotes: 2 Functional_Indels include frameshift, cds-Indel, and spliceSite variants. 3 In this step, variants were filtered against the mutations of the healthy control, II-9.

Discussion
The olfactory receptors (ORs), including OR2W3, were first defined in 1991 as a supergene family that encodes G-protein coupled receptor proteins (GPCRs) in the olfactory epithelium of the rat 29,30 . Zhao et al. explored the physiological function of ORs in initiating transduction in olfactory receptor neurons 31 . However, ORs are not exclusively expressed in the olfactory epithelium. Recent studies have demonstrated that ORs are expressed in a broad variety of other tissues, including the autonomic nervous system, brain, tongue, erythroid cells, prostate, placenta, gut, and kidney 32 . Furthermore, RNA sequencing of 16 different human tissues by Next Generation Sequencing (NGS) revealed that the OR2W3 gene was expressed in 9 different tissue samples and most highly expressed in thyroid 33 . These findings suggest that OR2W3 may have different functions in different human biological processes.
The OR2W3 gene, which is located at 1q44, has an intron-free reading frame of 942 nucleotides that encodes 314 amino acids. The UCSC Genome Browser 34 showed that OR2W3 shares exons with TRIM58 (Tripartite motif-containing protein 58). When we used the SWISS-MODEL server 35 to model the structure of the OR2W3 protein, JAGGED-1 (PDB ID: 2vj2B) 36 , which is also associated with an autosomal dominant inherited disease, Alagille syndrome 37 , showed the highest sequence identity with OR2W3. Recent studies have also shown that the biological functions of the OR2W3 gene are not restricted to the olfactory system, that is, to G-protein coupled receptor activity and olfactory receptor activity. Aston et al. 38 and Plaseski et al. 39 found that OR2W3 rs11204546 was associated with both azoospermia and oligozoospermia risk; a mutation in the OR2W3 gene (chr1:248059606, p.T240P) was associated with the metastasis of pancreatic ductal adenocarcinoma 40 ; and expression of OR2W3 was also found to be associated with long-term schizophrenia 41 , variability in response to β-blockers 42 , and the changes in global gene expression profiles in human cervical cancer HeLa cells exposed to non-activated dendrimers and dendriplexes 43 . However, an epidemiological survey and medical record retrieval showed that none of the subjects in the current study had these related diseases or mutations.
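The relationship between the nucleotide-level and protein-level descriptions of the mutation (c.424C>T and p.R142W) and the stated reading-frame length can be checked with a short Python sketch. It assumes 1-based coding positions and the standard genetic code; the reference codon itself is not stated in the text, so the code only enumerates which arginine codons would be compatible with the reported change. The helper name `codon_position` is hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codon_position(cds_pos: int):
    """Map a 1-based coding position to (codon number, position within codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


# A 942-nucleotide reading frame corresponds to 942 / 3 codons.
n_codons = 942 // 3

# c.424 falls in codon 142, at the first position of that codon.
codon_number, offset = codon_position(424)

# Arginine codons with C at that position whose C>T change yields tryptophan.
compatible = []
for codon, aa in CODON_TABLE.items():
    if aa == "R" and codon[offset - 1] == "C":
        mutated = codon[:offset - 1] + "T" + codon[offset:]
        if CODON_TABLE[mutated] == "W":
            compatible.append((codon, mutated))

print(n_codons, codon_number, offset, compatible)
```

Under the standard genetic code, the only arginine codon compatible with a C>T change at the first codon position producing tryptophan is CGG (to TGG), which is consistent with c.424C>T being described as p.R142W.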
Vision and olfaction are two of the major sensory systems, which coordinate and integrate information to provide us with a unified perception of our environment. Studies have shown that they share many links and common points, including neuroanatomical pathways 44 , cross-modal links and the extension of this notion to goal-directed actions 45 , and pathogenic or biological genes 46-48 . Woodard et al. 47 found that rdgB (retinal degeneration B), a gene required for normal visual system physiology, was necessary for the olfactory response of both adult flies and larvae, indicating that rdgB is required for both visual and olfactory physiology. Loss of olfactory receptor genes was also found to coincide with the acquisition of full trichromatic vision 46 . In this study, we found that a novel missense mutation in the OR2W3 gene was associated with autosomal dominant RP. This finding may indicate an essential link between vision and olfaction, and strongly suggests an exchange in the importance of these two senses.
As mentioned above, RP refers to a clinically and genetically highly heterogeneous group of inherited ocular diseases. Inheritance patterns include autosomal dominant, autosomal recessive, and X-linked modes. In this study, we presumed autosomal dominant inheritance in this family for two reasons. First, both of the first two generations include affected patients, and we excluded the possibility of intermarriage through an intensive epidemiologic survey. Second, the prevalence in the second generation was high (6/9 = 66.7%). Nevertheless, we also analyzed the data under the autosomal recessive model, including the homozygous and compound heterozygous models, but no promising mutations were detected. One limitation of this study is that, because the patients refused retinal biopsy, the results could not be strengthened by RNA analysis of this gene or immunolocalisation of the protein in multiple tissues, including the retina and retinal pigment epithelium (RPE) cells.

Conclusion
A novel missense mutation (OR2W3 R142W) was identified by whole-exome sequencing as being associated with RP. Our findings expand the phenotypic and mutation spectrum of RP and provide helpful clues for designing future studies to further investigate genetic factors for familial RP.