Somatic alterations compromised molecular diagnosis of DOCK8 hyper-IgE syndrome caused by a novel intronic splice site mutation

In hyper-IgE syndromes (HIES), a group of primary immunodeficiencies clinically overlapping with atopic dermatitis, early diagnosis is crucial to initiate appropriate therapy and prevent irreversible complications. Identification of underlying gene defects, such as in DOCK8 and STAT3, and corresponding molecular testing have improved diagnosis. Yet, in a child and her newborn sibling with an HIES phenotype, molecular diagnosis was misleading. Extensive analyses driven by the clinical phenotype identified an intronic homozygous DOCK8 variant, c.4626+76A>G, which creates a novel splice site, as disease-causing. While the affected newborn carrying the homozygous variant had no expression of DOCK8 protein, molecular diagnosis in the index patient was compromised by expression of altered and wildtype DOCK8 transcripts and DOCK8 protein as well as by defective STAT3 signaling. Sanger sequencing of lymphocyte subsets revealed that somatic alterations and reversions revoked the predominance of the novel over the canonical splice site in the index patient, explaining DOCK8 protein expression, whereas defective STAT3 responses in the index patient were explained by a T cell phenotype skewed towards central and effector memory T cells. Hence, somatic alterations and skewed immune cell phenotypes due to selective pressure may compromise molecular diagnosis and need to be considered when clinical and molecular findings are unexpected.

Patients' values were compared to age-matched references as previously described 1 or to healthy individuals. STAT3 tyrosine phosphorylation was assessed in PBMCs by flow cytometry using the BD Phosflow reagents according to the manufacturer's instructions (BD Biosciences) and by western blot using antibodies to Phospho-STAT3 (Tyr705), STAT3 and beta-Actin (all Cell Signaling, Danvers, MA, USA) as previously described. 3 To analyze a putative effect of autoantibodies on STAT3 phosphorylation, PBMCs of a healthy control were incubated overnight in the absence or presence of 10% patient serum or different control sera (adapted from 4 ), and Phospho-STAT3 (Tyr705) was measured by flow cytometry after IL-6 and IL-10 stimulation. The effect of ARHGAP32 on STAT3 phosphorylation was assessed in two ways: by transfecting a wildtype ARHGAP32 vector (OriGene, Rockville, MD, USA) into healthy control PBMCs by nucleofection using the Human T Cell Nucleofector Kit and a Nucleofector 2b (both Lonza, Basel, Switzerland) followed by flow cytometric analysis, and by measuring STAT3 phosphorylation in HAP1 ARHGAP32 knock-out cells and wildtype HAP1 cells (Horizon, Cambridge, UK) by western blot.

Genetic analyses
Targeted next-generation sequencing was performed as previously described 5 . Whole-genome sequencing (WGS) was performed using TruSeq DNA sample preparation (input 250 ng) and the Illumina HiSeq X sequencing platform according to the manufacturer's instructions. Illumina data were processed with the in-house pipeline v1.2.1 (https://github.com/UMCUGenetics/IAP), including GATK v3.2.2 6 , according to the best practice guidelines. 7,8 Briefly, paired-end reads were mapped with BWA-MEM v0.7.5a 9 to GRCh37, duplicates were marked, lanes were merged, and indels were realigned. Base quality score recalibration was omitted since it did not improve our results significantly. Next, GATK HaplotypeCaller was used to call SNPs and indels and to create GVCFs. These GVCFs were genotyped with GATK's GenotypeGVCFs for the described family. Variants were flagged as PASS if none of the following criteria was fulfilled. For SNPs: QD<2.0, MQ<40.0, FS>60.0, HaplotypeScore>13.0, MQRankSum<-12.5, ReadPosRankSum<-8.0, snpclusters>=3 in 35 bp. For indels: QD<2.0, FS>200.0, ReadPosRankSum<-20.0. Effect predictions and annotations were added using SnpEff 10 and dbNSFP 11 . De novo variants were detected with GATK's PhaseByTransmission and by filtering the Mendelian violations on the de novo model, requiring coverage >10x for every call. Sanger sequencing of the coding region and intron-exon boundaries of the DOCK8 gene was performed at the gDNA and cDNA level using specific oligonucleotide primers, as previously described 3 . Primer sequences are available upon request. Amplified gene fragments were sequenced with an ABI 3730 capillary sequencer (Applied Biosystems, Carlsbad, CA, USA). Mutations were reported using the nomenclature of den Dunnen and Antonarakis. 12
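The hard-filter thresholds listed above can be expressed as a simple rule table. The following Python sketch is illustrative only and is not the pipeline's own code: the function name, the dictionary layout, and the choice to skip annotations that are absent from a record are assumptions, and the SNP-cluster criterion is left out because it depends on neighbouring variants rather than on a single record.

```python
# Thresholds are copied from the text; everything else is an illustrative assumption.
SNP_FILTERS = {
    "QD": lambda v: v < 2.0,
    "MQ": lambda v: v < 40.0,
    "FS": lambda v: v > 60.0,
    "HaplotypeScore": lambda v: v > 13.0,
    "MQRankSum": lambda v: v < -12.5,
    "ReadPosRankSum": lambda v: v < -8.0,
}
INDEL_FILTERS = {
    "QD": lambda v: v < 2.0,
    "FS": lambda v: v > 200.0,
    "ReadPosRankSum": lambda v: v < -20.0,
}


def failed_filters(annotations, is_indel=False):
    """Return the names of the hard filters a variant fails.

    An empty list corresponds to PASS. Annotations missing from the
    record are skipped (assumption), not treated as failures.
    """
    rules = INDEL_FILTERS if is_indel else SNP_FILTERS
    return [
        name
        for name, fails in rules.items()
        if name in annotations and fails(annotations[name])
    ]
```

For example, `failed_filters({"QD": 5.0, "FS": 100.0})` reports the FS filter for a SNP, whereas the same annotations pass when `is_indel=True` because the indel FS threshold is 200.0.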

DOCK8 mRNA expression analyses
Expression levels of DOCK8 splice variants were quantified by an intercalating dye (EvaGreen)-based approach in 96-well plates on a QX200 droplet digital PCR (ddPCR) system with automatic droplet generation (Bio-Rad Laboratories) in duplicate, with reaction volumes of 21 µl, cDNA input of 5-20 ng RNA equivalent, and the following cycling conditions: 5 min at 95 °C, 40 cycles of (30 s at 96 °C, 1 min at 60 °C), 5 min at 4 °C, and 5 min at 90 °C. The following primers (synthesized by Integrated DNA Technologies, IDT) were used at a final concentration of 100 nM: DOCK8.36for: 5'-TGC CAC CCT TTA CCT CCT CA-3', DOCK8.37rev: 5'-TTC CCA CCA AAG ATG CCA G-3', DOCK8.24for: 5'-GCC TGG TTC TTC TTT GAG CTT C-3', DOCK8.26rev: 5'-AGA AAG CCA GGC TGA TGT TCA T-3'. All ddPCR runs were performed with cDNA of patients or healthy carriers expressing the splice variant (verified by Sanger sequencing), cDNA of healthy individuals as negative control, and purified, nuclease-free water as no-template control (NTC). Droplet fluorescence intensity values were exported from QuantaSoft (Bio-Rad Laboratories). Custom scripts were developed and used to import the intensity values into R (version 3.2.3; http://www.r-project.org). To compensate for baseline shifts of fluorescence intensity between reaction wells, data were centered using the following procedure. First, the droplet with the highest fluorescence intensity value in NTC wells was identified and its fluorescence intensity was denoted as maxNTC. Next, droplets in each well were divided into two groups with either low (≤ 2*maxNTC) or high (> 2*maxNTC) intensity. For each well, the median intensity value of the low-intensity group of droplets (medianLow) was calculated. Then, for each well, droplet intensity values were normalized by subtracting medianLow from each intensity value. Droplet intensity values were plotted after each step and inspected for negative and positive clusters.
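The centering procedure described above can be sketched in a few lines. The original analysis used custom R scripts that are not shown here, so this Python version is only an illustration under stated assumptions: wells are represented as a dictionary mapping a well identifier to a list of droplet intensities, and the function name `center_wells` is hypothetical.

```python
from statistics import median


def center_wells(wells, ntc_wells):
    """Center droplet intensities per well on the low-intensity median.

    wells: dict mapping well id -> list of droplet fluorescence intensities.
    ntc_wells: ids of the no-template control wells.
    """
    # maxNTC: the highest intensity seen in any NTC well.
    max_ntc = max(x for well_id in ntc_wells for x in wells[well_id])
    threshold = 2 * max_ntc

    centered = {}
    for well_id, values in wells.items():
        # Low-intensity group: droplets at or below 2 * maxNTC.
        low = [x for x in values if x <= threshold]
        if not low:
            raise ValueError(f"well {well_id} has no low-intensity droplets")
        median_low = median(low)
        # Subtract medianLow from every droplet in the well.
        centered[well_id] = [x - median_low for x in values]
    return centered
```

After centering, the negative cluster of every well sits near zero, which makes intensities comparable between wells before clustering.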
To automatically assign droplets to one of three groups (negative, positive for splice variant 1, or positive for splice variant 2), k-means clustering with a pre-specified number of clusters k was performed for each well. Target mRNA concentrations c were then calculated for each well from the number of positive droplets Np, the number of negative droplets Nn, and the average droplet volume V = 0.85 nanoliter based on Poisson distribution statistics using the formula c = (ln(Np + Nn) - ln(Nn))/V, where ln is the natural logarithm. For each splice variant, only droplets positive for this particular splice variant were considered positive, and all other droplets were considered negative.
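As a worked illustration of the Poisson formula above, the following Python sketch computes c for each splice variant from per-droplet cluster labels. It is not the authors' R code; the label strings ("neg", "sv1", "sv2") and the function names are assumptions made for the example. Because V is given in nanoliters, the result is in copies per nanoliter.

```python
import math

DROPLET_VOLUME_NL = 0.85  # average droplet volume V from the text


def concentration(n_positive, n_negative, volume_nl=DROPLET_VOLUME_NL):
    """c = (ln(Np + Nn) - ln(Nn)) / V, in copies per nanoliter."""
    if n_negative <= 0:
        raise ValueError("at least one negative droplet is required")
    return (math.log(n_positive + n_negative) - math.log(n_negative)) / volume_nl


def variant_concentrations(labels, variants=("sv1", "sv2")):
    """Compute c per splice variant from droplet cluster labels.

    For each variant, only droplets labelled with that variant count as
    positive; all other droplets, including those positive for the other
    variant, count as negative.
    """
    result = {}
    for variant in variants:
        n_positive = sum(1 for label in labels if label == variant)
        n_negative = len(labels) - n_positive
        result[variant] = concentration(n_positive, n_negative)
    return result
```

With equal numbers of positive and negative droplets, for instance, the formula reduces to ln(2)/V.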

DOCK8 protein assessment
To assess DOCK8 protein expression in different lymphocyte subsets, flow cytometry was performed using a protocol adapted from 13 .

DOCK8 splicing analyses
Prior to the Sashimi plot and the percent spliced in (psi, Ψ) analysis, the GTEx samples were filtered to obtain a more homogeneous dataset. 14,15 In total, 2616 samples passed the following filter steps: assembly=="HG19_Broad_variant", library_type=="cDNAShotgunStrandAgnostic", samples are not technical controls, and molecular_data_type=="Allele-Specific Expression". Reads from all samples were pooled, and only reads that mapped around DOCK8 exons 32 and 36 were included in the Sashimi plots. To evaluate how efficiently an intron is spliced out of a transcript, a psi analysis was performed. The Ψ values for the 5' and 3' sites were calculated for each sample as:
In silico analysis of splice site prediction was performed using multiple methods: NNSPLICE0.9, Human Splicing Finder (HSF) Version 3.1, SpliceAid2, SplicePort, and CryptSplice. 17-21 CryptSplice returned no result since the variant was too far away from the canonical splice site. Splicing was evaluated by minigene assay. In brief, PBMCs of healthy controls were transiently transfected with a wildtype and a mutated minigene plasmid using the human T-cell nucleofector kit (Lonza, Cologne, Germany) according to the manufacturer's instructions. Minigene constructs were generated by cloning a PCR amplicon of human genomic DNA into a pCMV56 vector as previously described. 22

Full-length western blots of whole PBMC lysates of patient II.2 (a-d) and patient II.3 (e-h) in comparison to a healthy control and the parents are shown. Western blots were probed with different DOCK8 antibodies (immunogen indicated in brackets; aa: amino acid) and Actin as a loading control. Western blots were cut at the dashed lines where indicated (c, d, g, h). Exposure times are indicated; sec: seconds, min: minutes.
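The GTEx sample filter described in the splicing analyses above can be sketched as a predicate over sample metadata records. This Python example is illustrative only: the fields assembly, library_type and molecular_data_type and their values are taken from the text, whereas the `is_technical_control` flag and the function names are hypothetical, since the text does not state how technical controls were identified.

```python
def passes_gtex_filter(sample):
    """Return True if a sample metadata record meets all four filter steps.

    sample: dict of metadata fields. `is_technical_control` is a
    hypothetical flag; a missing flag is treated as "not a control".
    """
    return (
        sample.get("assembly") == "HG19_Broad_variant"
        and sample.get("library_type") == "cDNAShotgunStrandAgnostic"
        and not sample.get("is_technical_control", False)
        and sample.get("molecular_data_type") == "Allele-Specific Expression"
    )


def filter_samples(samples):
    """Keep only the samples that pass every filter step."""
    return [sample for sample in samples if passes_gtex_filter(sample)]
```

Applying every criterion in a single predicate keeps the selection reproducible and makes the number of retained samples easy to check against the reported count.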