Neurofibromatosis type 1 (NF1; MIM#162200) is one of the most frequent autosomal dominant conditions, with a worldwide birth incidence of 1 in 2500. Although fully penetrant, the disease shows highly variable expressivity among patients and over time. Frequent manifestations of NF1 include pigmentary features (café-au-lait spots, skinfold freckling, and Lisch nodules), skeletal abnormalities (scoliosis and long bone dysplasia), behavioral symptoms (cognitive impairment, attention-deficit/hyperactivity disorder, and autism spectrum disorder), and tumors (peripheral nerve sheath tumors and gliomas). Other, rarer manifestations include juvenile xanthogranuloma, pheochromocytoma, and gastrointestinal stromal tumors.

NF1 is caused by germline heterozygous loss-of-function variants in the NF1 gene (MIM#613113), located at 17q11.2. The gene comprises 58 constitutive and two alternative exons and spans ~280 kb. The 8517-nucleotide open reading frame of the NF1 gene (NM_001042492.3) encodes a 2839-amino-acid protein, neurofibromin, which acts as a tumor suppressor by negatively regulating the RAS-MAPK pathway. A very large number of different loss-of-function NF1 pathogenic variants have been identified; for example, more than 3700 unique NF1 variants (mostly germline) have been reported in the Global Variome shared Leiden Open Variation Database. NF1 pathogenic variants are distributed throughout the entire coding region and splice sites, with no hotspot. A significant proportion of these variants are large rearrangements [1, 2].

The identification of affected individuals has classically relied on clinical assessment and diagnosis according to standardized NIH criteria [3]. A recent international effort revised the diagnostic criteria for NF1 and incorporated genetic testing into them [4]. Genetic testing can be expected to become standard of care for a definite diagnosis, which is increasingly relevant as strategies for clinical management and genetic counseling continue to improve.

The advent of short-read next-generation sequencing (NGS) has greatly improved the diagnostic yield and shortened the time to molecular diagnosis of NF1. An NF1 pathogenic variant can be identified in more than 95% of NF1 cases [5]. Classically, molecular diagnostic strategies for NF1 rely on short-read NF1-targeted NGS of DNA samples (mostly extracted from leukocytes). RNA approaches are also being developed [6]. For cases that remain negative with classical approaches, more comprehensive methods, or methods that allow easier detection of large rearrangements, can be used. Short-read whole-genome sequencing (WGS) has proven useful in overcoming diagnostic deadlocks [7]. Recent long-read approaches may also be of interest [8]. Long-read technologies have led to significant discoveries in individual laboratories. Among larger programs, notable implementations include the Vertebrate Genomes Project (VGP), the Telomere-to-Telomere Consortium (T2T), and the ongoing Human Pangenome Reference Consortium (HPRC).

In a recent study [9], Alesi et al. used optical genome mapping (OGM) to characterize large inversions involving the NF1 gene in two unrelated patients clinically diagnosed with NF1. In one patient, two NF1 intragenic deletions of exons 4–7 and 31–35 (double heterozygote in cis) were first identified using a specific multiplex ligation-dependent probe amplification (MLPA) assay. OGM analysis then revealed that this structural variant was actually an intragenic inversion with co-occurring short deletions at the breakpoints. Alesi et al. drew our attention to one of the patients we previously reported [10], in whom a double NF1 intragenic multi-exon deletion had been identified using a specific MLPA assay. This finding was similar to what Alesi et al. observed in one of the two patients they described. We therefore used long-read nanopore sequencing (Oxford Nanopore Technologies, ONT) to reanalyze and better characterize this structural variant.

We performed complementary genetic analyses in the NF1 index case from the French NF cohort who carried a double NF1 multi-exon deletion [10]. The patient was enrolled in a French clinical research program (Programme Hospitalier de Recherche Clinique) entitled “Study of NF1 expressivity”. His full phenotypic information was recorded by a referring clinician using a standardized questionnaire. At the age of 21, the patient presented with 15 café-au-lait spots, bilateral inguinal and axillary freckling, numerous cutaneous and subcutaneous neurofibromas, one plexiform neurofibroma, learning disabilities, pseudarthrosis, dystrophic scoliosis, and vertebral dysplasia. He inherited the disease from his mother, for whom DNA was not available.

For the initial analysis, DNA was extracted from an EDTA blood sample using standard proteinase K digestion followed by phenol-chloroform extraction. A double multi-exon deletion (exons 32–36 and 49–58 of NF1) was identified in the patient by NF1-targeted NGS and multiplex ligation-dependent probe amplification (MLPA) analysis with the SALSA MLPA kits P081/P082 NF1 (MRC Holland), as previously described [1].

We performed long-read sequencing on the Oxford Nanopore MinION platform (ONT) with CRISPR/Cas9-targeted enrichment, according to the manufacturer's protocols. Specific guides were designed to target NF1 introns (IVS) 31, 36, and 48, with a 7–15 kb spacing, using the CHOPCHOP website [11] (sequences available on request). Analysis of the generated reads showed a 49 kb deletion encompassing NF1 exons 32–36, and an in cis 56 kb deletion extending from NF1 intron 48 (NM_001042492.3) to the first exon of RAB11FIP4 (NM_032932.6). Breakpoint-spanning PCR and Sanger sequencing were performed for the wild-type and mutated NF1 alleles, using specific primers (available on request). Sanger sequencing of the two breakpoint junctions confirmed the two molecular events, their genomic locations, and the absence of a concomitant inversion (Fig. 1). The apparently independent deletions can be described as follows (GRCh38/hg38): NC_000017.11:g.31258310-31307676delinsCTTTATATTAA and NC_000017.11:g.31344192-31400545delinsGAAGGGGCCGG. No sequence similarity was found between the regions involved in the double-deletion event (IVS31, 36, and 48 of NF1, and IVS1 of RAB11FIP4). We did, however, observe a short duplication of a few base pairs at the junction sites (Fig. 1).
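As a quick consistency check on the two genomic descriptions above, the reported coordinates can be parsed and the deleted spans recomputed. The following is only a minimal sketch: the helper name `parse_delins` and the regular expression are illustrative assumptions, not part of the analysis pipeline used in this study, and the pattern accepts either "-" (as written here) or "_" (the range separator generally used in HGVS nomenclature).

```python
import re

# Genomic descriptions copied verbatim from the text (GRCh38/hg38).
DESCRIPTIONS = [
    "NC_000017.11:g.31258310-31307676delinsCTTTATATTAA",
    "NC_000017.11:g.31344192-31400545delinsGAAGGGGCCGG",
]

# Illustrative pattern (an assumption, not a full HGVS parser):
# reference, start, end, and inserted bases of a genomic delins.
PATTERN = re.compile(
    r"^(?P<ref>[^:]+):g\.(?P<start>\d+)[-_](?P<end>\d+)delins(?P<ins>[ACGT]+)$"
)


def parse_delins(description):
    """Return coordinates and sizes for one genomic delins description."""
    match = PATTERN.match(description)
    if match is None:
        raise ValueError(f"Unrecognized description: {description}")
    start, end = int(match["start"]), int(match["end"])
    if end < start:
        raise ValueError("End coordinate precedes start coordinate")
    deleted_bp = end - start + 1  # coordinates are 1-based and inclusive
    inserted = match["ins"]
    return {
        "ref": match["ref"],
        "start": start,
        "end": end,
        "inserted": inserted,
        "deleted_bp": deleted_bp,
        "net_loss_bp": deleted_bp - len(inserted),
    }


events = [parse_delins(d) for d in DESCRIPTIONS]

# Sequence retained between the two deletions on the same allele.
retained_gap_bp = events[1]["start"] - events[0]["end"] - 1

for event in events:
    print(event["ref"], event["start"], event["end"], event["deleted_bp"])
print("retained between deletions (bp):", retained_gap_bp)
```

With the coordinates given above, the deleted spans come to 49,367 bp and 56,354 bp, consistent with the 49 kb and 56 kb deletions reported, and 36,515 bp are retained between the two events.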

Fig. 1: Complementary molecular analysis showing double multi-exon deletions.

Visualization of the deleted regions in UCSC Genome Browser (GRCh38/hg38) and Sanger sequencing analysis of wild-type sequence and patient’s deletions of (A) NF1 exons 32–36, and (B) NF1 exons 49–58 to the first exon of RAB11FIP4. Red dotted rectangles show duplicated motifs.

In their study, Alesi et al. [9] resolved the molecular structure of two NF1 inversions using OGM. However, the limited resolution of this technique, which detects structural variants larger than 1 kb, makes it impossible to identify the exact breakpoints of such events, a limitation they overcame by resorting to WGS. Using nanopore long-read sequencing, we achieved sequence-level resolution of the molecular events previously observed in our patient. Knowing the approximate breakpoints of the deletions from NGS and MLPA, we performed CRISPR/Cas9-targeted enrichment of the region of interest. This approach allowed us to overcome the need for high-molecular-weight DNA for high-quality long-read sequencing. The enrichment approach also reduces the total amount of sequence required, since the fragments of interest cut by the CRISPR/Cas9 system are preferentially sequenced over the rest of the genome. We then confirmed the junction sequences by Sanger sequencing. This sequential strategy allowed the characterization of the double-deletion event at the molecular level (Fig. 1) and showed the absence of an inversion between these two large intragenic deletions. OGM and nanopore sequencing may be considered complementary tools to finely characterize complex molecular events that are missed or incompletely described by conventional short-read sequencing technologies [12]. However, the use of such a dual approach in routine diagnosis may still be limited by its cost.

This case report, together with the two patients reported by Alesi et al., highlights the benefits of long-read technologies in the characterization of complex structural rearrangements and repetitive elements, and in the capture of G+C-rich regions (mainly found in gene regulatory regions). Accurate molecular annotation using long-read techniques should lead to a better understanding of the mechanisms underlying large and complex rearrangements or regulatory sequence alterations, which are often poorly described by short-read NGS technologies or CNV analysis techniques. In addition, accurate typing of DNA alterations is an essential prerequisite for establishing reliable genotype-phenotype correlations that may improve genetic counseling and management of patients with genetic diseases.