The integration of massively parallel sequencing (MPS) (gene panel, exome, or whole genome sequencing) into clinical practice globally means that a diagnosis can increasingly be made for children presenting with global developmental delay/intellectual disability [1]. However, of the more than 1500 recognised or candidate monogenic causes of intellectual disability [2], all are rare and most are ultra-rare, affecting fewer than 1:2,000 or 1:2,000,000 of the population respectively [3]. Thus, although a genetic diagnosis is more readily available for children with a suspected genetic cause of their intellectual disability or developmental delay, many challenges and uncertainties remain for clinicians attempting to translate a genetic test result into improved care and health outcomes for an affected individual and their family.

Dingemans and colleagues have taken a careful and systematic approach to tackling many of these challenges in their analysis of 52 individuals with Zhu-Tokita-Takenouchi-Kim (ZTTK) syndrome [4]. Heterozygous variants in SON, a gene important in cell cycle progression and as an RNA splicing cofactor, were first described as the cause of this syndrome in 2016 by three independent groups [5,6,7].

The first major challenge that the authors tackled was that, for many recently described genetic conditions, there are scant publications delineating the genotypic and phenotypic spectrum and natural history in detail, which makes it very difficult for a clinician to provide accurate prognostic information to the family. To address this, the authors took a comprehensive three-pronged approach of (i) conducting a systematic literature search for all publications mentioning SON or ZTTK syndrome, (ii) recruiting additional patients through GeneMatcher exchanges and inter-collegiate contacts, and (iii) providing a pathway for ongoing curation of phenotypic and genotypic data via the Human Disease Genes Webseries [8]. The end result was the collation of comprehensive genotypic and phenotypic data on 52 individuals, including 17 who were not previously reported, making this the largest ZTTK cohort published to date.

The second challenge that the authors tackled was the heterogeneous way in which clinical data can be presented and interpreted. They acknowledged that when clinical information is missing from a publication, this can be either because a phenotype was truly not present, or because it was simply not asked about and/or reported. They assessed the likelihood that data were simply not reported by assessing the ‘completeness’ of the publication. The end result was a more complete, up-to-date overview, which they presented in a helpful clinical table using Human Phenotype Ontology (HPO) terminology, as is recommended by the FAIR (Findability, Accessibility, Interoperability, and Reusability) data management and stewardship principles [9]. They also tackled the rather subjective interpretation of facial dysmorphism by using quantitative facial phenotyping. This analysis found no consistent characteristic facial gestalt for ZTTK syndrome, which led them to emphasise the importance of confirming a molecular diagnosis in all individuals suspected of being affected. They translated the findings from their systematic data collection into recommendations for surveillance of affected individuals.

The third challenge was the lack of natural history data for many rare genetic conditions [10]. Although theirs was not a prospective natural history study, the authors carefully analysed the (small) number of adults in the cohort, which allowed them to make important clinical observations, for example that early-onset hypertension may be associated with this condition and is thus important to monitor for.

Fourth, recognising the frequent lack of clinical information on rare genetic conditions in ‘lay’ formats that can be shared with the affected individual and their family, they have provided updated information for families on the Human Disease Genes Webseries, including graphical representations of the clinical findings (https://humandiseasegenes.nl/son/) [8]. A lack of high-quality, easily understandable information on rare disease is a major barrier to the provision of high-quality integrated care for the 300 million people living with rare diseases globally [11], and one of the major reasons for the lack of patient and caregiver satisfaction with the clinical care of people with rare diseases (https://download2.eurordis.org/rbv/rare2030survey/reports/RARE2030_survey_public_report_en.pdf).

Lastly, the authors took on the challenge of ascribing pathogenicity to previously unreported rare missense variants. Functional studies are not incorporated into the workflow of most diagnostic laboratories, which means that clinicians are often unable to reclassify a variant of uncertain clinical significance, leaving the family ‘in limbo’, unable to reach certainty for diagnosis or reproductive counselling. It is in this environment that international variant curation efforts, such as ClinVar [12], and dedicated government funding to ‘matchmake’ clinicians with a laboratory able to perform functional studies, should be applauded [13]. Although the majority (49/52) of reported affected individuals had ‘loss of function’ variants (e.g. frameshift, whole gene deletion, nonsense) or in-frame deletions, three individuals each had a unique heterozygous missense variant. The authors were able to collect patient-derived cells from one of these individuals and performed a functional analysis, which did not demonstrate the dysregulation of splicing or the downregulation of SON-regulated genes that have been linked to loss of function variants. They thus remain appropriately cautious about ascribing pathogenicity to missense variants in SON at present.

The authors should be applauded for their comprehensive and very clinically practical guide to this recently described rare neurodevelopmental condition. Their approach could serve as a template for clinical research on other rare genetic conditions, following their initial delineation.