KBG syndrome: videoconferencing and use of artificial intelligence driven facial phenotyping in 25 new patients

Genetic variants in Ankyrin Repeat Domain 11 (ANKRD11) and deletions in 16q24.3 are known to cause KBG syndrome, a rare syndrome associated with craniofacial, intellectual, and neurobehavioral anomalies. We report 25 unpublished individuals from 22 families with molecularly confirmed diagnoses. Twelve individuals have de novo variants, three have inherited variants, and one has a variant inherited from a parent with low-level mosaicism. The mode of inheritance was unknown for nine individuals. Twenty individuals carry truncating variants, and the remaining five carry missense variants (three of these five belong to one family). We present a protocol emphasizing the use of videoconferencing and artificial intelligence (AI) in collecting and analyzing data for this rare syndrome. A single clinician interviewed 25 individuals across eight countries. Participants’ medical records were reviewed, and data were uploaded to the Human Disease Gene website using Human Phenotype Ontology (HPO) terms. Photos of the participants were analyzed with the GestaltMatcher and DeepGestalt (Face2Gene platform, FDNA Inc, USA) algorithms. Within our cohort, common traits included short stature, macrodontia, anteverted nares, wide nasal bridge, wide nasal base, thick eyebrows, synophrys and hypertelorism. Behavioral issues and global developmental delays were widely present. Neurologic abnormalities including seizures and/or EEG abnormalities were common (44%), suggesting that early detection and seizure prophylaxis could be an important point of intervention. Almost a quarter (24%) were diagnosed with attention deficit hyperactivity disorder and 28% were diagnosed with autism spectrum disorder. Based on the data, we provide a set of recommendations regarding diagnostic and treatment approaches for KBG syndrome.


INTRODUCTION
KBG syndrome (OMIM:148050), first described by Herrmann et al. [1], is named after the surnames (K-B-G) of the first families reported with the syndrome. The original report described anomalies such as short stature, skeletal abnormalities, cognitive disability, and specific craniofacial dysmorphisms. Subsequent research has expanded the list of anomalies to include seizures, behavioral disturbances, congenital heart defects, and gastrointestinal issues [2][3][4][5][6][7][8]. Genetic variants in Ankyrin Repeat Domain 11 (ANKRD11) and deletions in 16q24.3 are known to cause KBG syndrome [9]. One author (GJL) was introduced to KBG syndrome by an original describer of the syndrome (John Opitz), and published a case report describing a 13-year-old boy with epilepsy, severe developmental delay, distinct facial features, and hand anomalies [10]. The very serious nature of his epilepsy and its subsequent negative impact on development was notable. In the present study, GJL met and interviewed 25 individuals with KBG syndrome to better characterize the disease and investigate the effects of epilepsy and other conditions on the trajectory of neurodevelopment in individuals with KBG syndrome. Additionally, a facial photograph could ideally be combined with medical records and variant prioritization efforts, after exome or genome sequencing, to accurately classify new pathogenic missense and other variants in rare syndromes. We assess the current state of two leading facial recognition software algorithms [11,12] and demonstrate the use of a variant prioritization approach, PEDIA, [13] that integrates phenotypic features, facial images, and exome data.

METHODS
Twenty-five individuals (11 females, 14 males) from 22 families throughout eight countries were interviewed via Zoom (version 5.2.0) by a single physician (GJL) over a 4-month period from February 2021 to June 2021. All interviews were conducted in English, with a translator used for one family whose primary language was Spanish. Interviews were approximately one to two hours long and consisted of structured questions and the physician's visual assessment of facial and limb phenotypic characteristics.
All patients interviewed were molecularly diagnosed with KBG syndrome and were self-referred or recruited via a private Facebook group created by the KBG Foundation. Genetic reports, medical records including imaging, and photos (facial and whole-body) were collected from families by email and compiled prior to the interviews. Photo consent was obtained. Height and weight at the time of videoconference were obtained via verbal report or documented from the most recent medical reports and growth charts.
All variants were annotated to the NM_013275.5 transcript in GRCh37/hg19. Every reported anomaly was documented as a standardized Human Phenotype Ontology (HPO) term and compiled on the open-source Human Disease Genes (HDG) website series to promote international data sharing [14]. The presence of a trait or phenotype was documented when it was explicitly stated in the interview or found in the individual's medical records. Facial photos provided by the families and/or taken by the clinician during videoconferencing were loaded onto Face2Gene (version 20.1.4; FDNA Inc, USA) [11] and GestaltMatcher (version 1.0) through Bonn University [12]. DeepGestalt photos and phenotype data were uploaded on August 24, 2021 (at which time the GestaltMatcher algorithm was not available in Face2Gene). These programs use deep convolutional neural networks to build syndrome and patient classifiers, respectively.
Face2Gene (F2G) uses several different algorithms, including DeepGestalt, a facial phenotyping framework that measures the similarity between a patient and a specific genetic disorder. Its algorithm is trained using images of individuals from many different genetic syndromes. Once a photo is uploaded, the software provides a 'gestalt score', with a higher score indicating greater similarities in facial morphology to a specific disorder [11]. In addition to a photo, the physician can input relevant phenotypic features (e.g., anteverted nares, prominent nasal bridge) which are used to derive a 'feature score', an indicator of how well the clinical text seems to fit a specific diagnosis. The gestalt and feature scores, ranked high, medium, or low, are further combined to produce a list of the top 30 syndromes that the individual most closely matches. This "combined score" is based on an optimization of a test set proprietary to FDNA. The clinician then confirms the diagnosis as a "differential", "clinically diagnosed" or "molecularly diagnosed" (Fig. 1).
GestaltMatcher, an extension of DeepGestalt, quantifies the similarity in features between two patients with KBG syndrome, allowing for the identification of syndrome-specific genetic traits. Unlike DeepGestalt, which quantifies similarities on a syndromic level, GestaltMatcher quantifies similarities between images and returns a similarity score for various individuals with one specified syndrome. We uploaded pictures of the 25 individuals with KBG syndrome and calculated the degree of overlap in facial features.
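As a rough illustration of what a pairwise comparison between photographs involves, the sketch below assumes that each face has already been reduced to a numeric feature vector and compares those vectors by cosine distance. Both the vectors and the choice of cosine distance are assumptions made for illustration only; the description above does not specify the metric, and the function names are hypothetical rather than part of any GestaltMatcher interface.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def pairwise_distances(embeddings):
    """Return {(id_a, id_b): distance} for every unordered pair of individuals."""
    ids = sorted(embeddings)
    return {
        (a, b): cosine_distance(embeddings[a], embeddings[b])
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    }

# Placeholder vectors, not real patient data.
toy = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [2.0, 0.0, 0.0]}
distances = pairwise_distances(toy)
```

Pairs with a small distance would be candidates for the same facial sub-cluster; any clustering threshold would have to be chosen separately.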
Furthermore, we used a variant prioritization approach, PEDIA (version 1.1) [13], which uses a facial photo, clinical features (HPO terms) and exome sequencing data as input. PEDIA integrates the per-gene scores calculated from DeepGestalt, Case Annotations and Disease Annotations (CADA) [15], and Combined Annotation-Dependent Depletion (CADD) [16]. DeepGestalt first derives the gestalt score of each gene, then a CADA score of each gene is calculated (https://cada.gene-talk.de/webservice/) for feature analysis. Since only the disease-causing variant of each patient was known, we performed the exome simulation by inserting the disease-causing variants into a randomly selected exome from the 1000 Genomes Project [17] to obtain the genomic score. We then annotated the exome variants with a CADD score [16], using the highest CADD score among the variants of each gene. The PEDIA model was trained with the gestalt, CADA, and CADD scores and used to calculate likelihood scores for each gene for all 25 patients. These scores were sorted in descending order, with the top-1, top-10 and top-30 accuracy reported.
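Two of the steps above, keeping the highest CADD score among the variants of each gene and sorting genes in descending order of score, can be sketched as follows. This is a minimal illustration and not the PEDIA implementation: it ranks on a single score rather than on the trained model that combines gestalt, CADA, and CADD scores, the function names are hypothetical, and the scores and the gene labels other than ANKRD11 are placeholders.

```python
def max_cadd_per_gene(variants):
    """variants: iterable of (gene, cadd_score); keep the highest score per gene."""
    best = {}
    for gene, score in variants:
        if gene not in best or score > best[gene]:
            best[gene] = score
    return best

def rank_genes(scores):
    """Return gene names sorted by score, highest first."""
    return sorted(scores, key=scores.get, reverse=True)

# Placeholder scores for illustration; GENE_X and GENE_Y are made-up labels.
variants = [
    ("ANKRD11", 35.0),
    ("ANKRD11", 12.1),
    ("GENE_X", 22.4),
    ("GENE_Y", 8.3),
]
per_gene = max_cadd_per_gene(variants)
ranking = rank_genes(per_gene)
```

The position of the known causal gene in such a ranking is what the top-1, top-10, and top-30 accuracies summarize across patients.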
The annotation of variants in ClinVar for Supplementary Table 2 is described in further detail in Supplementary Methods.

Molecular findings
The variants occurred de novo in 12 individuals, were maternally inherited in Individuals K and L, and paternally inherited in individual O. One parent of affected Individual T, Individual U, showed a low level of mosaicism for the variant (with only 2 out of 298 sequencing reads for this variant found in her blood). Nine individuals had unknown modes of inheritance. A majority, 20, are truncating variants (frameshift or nonsense), and five are missense (with three of five belonging to the same family). Twenty-one distinct variants were identified (Table 1), with locations shown in Fig. 2 [18].
Truncating variants are classified by ACMG criteria [19] as: "PVS1 null variant (nonsense, frameshift) in a gene where loss of function is a known mechanism of disease." Some variants are classified as "PS2 De novo (both maternity and paternity confirmed) in a patient with the disease and no family history". One missense variant in our cohort, p.(Val586Met), was seen in a heterozygous control individual in the Genome Aggregation Database (gnomAD), thus calling into question its pathogenicity. It is also formally possible that the one individual in gnomAD might be mildly affected. The mother with this variant (individual M) has a very mild phenotype whereas her children (individuals K and L) have phenotypes more consistent with KBG syndrome. However, a recent preprint [20] demonstrated that some missense variants do impair ANKRD11 ability and/or stability, but that these variants mainly localize in the Repression Domain 2. Those authors also tested one variant in the Repression Domain 1 (p.Leu509Pro), which turned out to have no effect on ANKRD11 stability or activity. The p.(Val586Met) variant of individuals K, L, and M also falls within the Repression Domain 1, and it has a borderline CADD score (23.9) and is not as highly conserved as the other missense variants. In addition, the affected nucleotides and corresponding amino acid are also not highly conserved when the sequence is aligned with other species. Per DeepGestalt, these individuals (K, L, M) did not have KBG syndrome listed in their top 30 differentials. Segregation analysis with the mother and sister of Individual M is not yet available. While the mother has very mild clinical features of KBG syndrome, the sister (aunt of Individuals K and L) is potentially reporting more severe symptoms. Ultimately, the pathogenicity of the p.(Val586Met) variant is still uncertain.
A different missense variant, p.(Arg2536Gln), arose de novo and was initially classified as a variant of uncertain significance because it had not been previously reported. However, it has been reclassified because of new information available: two additional patients carrying the variant. One is reported in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/variation/1012410/?new_evidence=false), a patient in whom the variant was maternally inherited (referred to as Individual Z in Supplementary Information), but who was unavailable for videoconferencing. In the other previously reported patient, the variant arose de novo and was classified as pathogenic [21]. Although a more extensive cosegregation analysis for the patient reported in ClinVar is not available, phenotypes characteristic of KBG syndrome are seen in three individuals possessing this variant, so the variant is reclassified as likely pathogenic. Further details about these cases can be found in Supplemental Text and Case Summaries.
As of April 2022, 429 putative missense or non-frameshift deletion, substitution, or insertion variants in ANKRD11 had been submitted to ClinVar [22], many of them listed as variants of uncertain significance. Supplementary Table 2 lists these variants, with bioinformatic analyses providing a suggested consensus classification for each.
Median age of the 25 individuals was 11 years and average age was 15 years (range = 1-59). One comes from a consanguineous family, roughly half (n = 12) had a history of congenital abnormalities in the family, and eight had relatives with intellectual disabilities.
The parents of individuals B, D, T, and Y had histories of miscarriage. The variant was de novo for individual B, whereas the parent of individual T (Individual U) was mosaic for the missense variant (as noted above). The mother of individual Z has a history of several miscarriages early in pregnancy, at around six weeks of gestation. The inheritance pattern is unknown for individuals D and Y.
The parents in this study (M, P, U) generally had mild phenotypic features. Individual M, the mother of K and L, possessed some distinct facial traits (e.g., thick eyebrows, anteverted nares, broad nasal base); however, the overall constellation of features was not typical of KBG syndrome. She did not present with common features such as developmental delay, macrodontia, or short stature. Conversely, individual P, the father of O, presented with global developmental delay, macrodontia, and short stature among other common traits of KBG syndrome. Lastly, individual U, the mother of T, had mild facial features (e.g., synophrys, thick eyebrows, wide nasal bridge, prominent nasal tip) with speech delays and seizures in childhood.
The overall frequency of certain phenotypic features is shown in Table 2, and these are reviewed in further detail in the following sections.

Facial features
The photographs with permission for publication are shown in the accompanying figure.
DeepGestalt results. KBG syndrome was recommended among the top 30 syndromes and ranked as the first (i.e., most likely) diagnosis for 28% (n = 7) of individuals, second for 40% (n = 10), and third or fourth for 12% (n = 3). Overall, 80% (n = 20) of the patients' photos analyzed had KBG syndrome ranked in their top five potential diagnoses out of the 30 possible suggested syndromes from among the 300+ syndromes currently recognized by the DeepGestalt algorithm. Among the 20 with KBG in the top five, seven had a high gestalt score, 10 had a medium gestalt score, and three had a low gestalt score. Fourteen had a medium feature score, five had a low score, and one was unranked for features of KBG (see Supplementary Table 3). Individuals B, F, and J initially submitted photos in which they were wearing glasses. After analyzing photos without glasses, the ranking of KBG surprisingly dropped from two to six for individual B and from two to three for individual J. Ranking did not change for individual F. While KBG ranking fluctuated, the gestalt and feature levels did not change between the photos with and without glasses for any of the three individuals. Five individuals (K, L, M, P, U) did not have KBG syndrome appear among the 30 differential diagnoses. First-ranked diagnoses instead included Cornelia de Lange, Williams-Beuren, Rubinstein-Taybi, Angelman, and mucopolysaccharidosis. Notably, Individual P was 55-60 years old at the time of the videoconference whereas Individual U was 30-35 years old, and both of them initially submitted pictures of themselves around those ages. These ages fall above our median age of 11 years and the age at which most individuals are diagnosed with KBG syndrome. DeepGestalt relies on the photos that it is trained on, so photos of older individuals may not be classified as accurately.
Additionally, individual U has very low-level mosaicism for this variant, potentially resulting in lower phenotypic expression of facial features. The other three individuals who were unranked (K, L, and M) are all from the same family and possess the same missense variant (Table 1) with questionable pathogenicity.
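The DeepGestalt rank counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the counts stated in the text (7, 10, 3, and 5 of 25); the variable and function names are illustrative.

```python
COHORT_SIZE = 25

# Number of individuals by the rank DeepGestalt assigned to KBG syndrome.
rank_counts = {
    "1st": 7,
    "2nd": 10,
    "3rd or 4th": 3,
    "not in top 30": 5,
}

def percent(count, total=COHORT_SIZE):
    """Percentage of the cohort, rounded to the nearest whole number."""
    return round(100 * count / total)

percentages = {label: percent(n) for label, n in rank_counts.items()}

# Individuals with KBG syndrome ranked within the top five suggestions.
top_five = rank_counts["1st"] + rank_counts["2nd"] + rank_counts["3rd or 4th"]
top_five_percent = percent(top_five)
```

The four categories account for all 25 individuals, which implies that no individual had KBG syndrome ranked between fifth and thirtieth in this analysis.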
Variant prioritization with facial images. Using the PEDIA score, the disease-causing gene ANKRD11 ranked first in 18 of 25 individuals (top-1 accuracy: 72%) and within the top 10 genes in 22 of 25 (top-10 accuracy: 88%). ANKRD11 was among the top 30 genes for all 25 individuals.
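Top-k accuracy here is the fraction of individuals for whom the causal gene appears at rank k or better. A minimal sketch of the calculation follows. The per-individual ranks are not given in the text, so the list below is a placeholder constructed only to match the reported totals (18 at rank 1, 22 within the top 10, 25 within the top 30); the exact positions beyond rank 1 are assumptions.

```python
def top_k_accuracy(ranks, k):
    """Fraction of cases whose causal gene is ranked at position k or better."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Placeholder ranks, not the study's per-individual results:
# 18 cases at rank 1, 4 cases at ranks 2-10, 3 cases at ranks 11-30.
ranks = [1] * 18 + [2, 3, 5, 9] + [12, 20, 28]

top1 = top_k_accuracy(ranks, 1)
top10 = top_k_accuracy(ranks, 10)
top30 = top_k_accuracy(ranks, 30)
```

Because each threshold includes all cases counted at stricter thresholds, top-k accuracy can only stay the same or increase as k grows.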
Cognition and neurologic features. Eight reported an intelligence quotient (IQ) score, with a mean of 73 ± 4.84 (range = 64-80) as measured by the Wechsler Intelligence Scale (3rd to 5th edition). A majority (68%) are considered mildly to moderately intellectually disabled based on level of functioning. Global developmental delays prior to 5 years were seen in 68% (n = 17), with nine being classified as mild. Median age of crawling onset was 12 months (range = 9-24) (n = 8), walking onset 22 months (range = 12.5-36) (n = 10), and speech onset 30 months (range = 19-36) (n = 6). Selective mutism and absent speech were observed in three individuals. Common types of seizures reported included myoclonic, tonic-clonic, and absence, with no specific type predominating [23]. Electroencephalogram (EEG) abnormalities were documented in three of 11 individuals with seizures. According to maternal report, Individual E was meeting speech and motor milestones until the onset of myoclonic seizures, complex partial seizures, and verbal tonic seizures with respiratory distress around 0.5-2 years of age. Similarly, individuals H, K, R, S, T, U, X, and Y reported histories of various types of seizures and concurrent speech and motor delays. Brain abnormalities detected on magnetic resonance imaging (MRI) included pineal cyst, arachnoid cyst, choroid plexus cyst, subdural hemorrhage, and small pituitary gland.
Abnormal mood included abnormal emotion or affect, depression, and/or anxiety. Self-injurious behavior, including self-biting, was also reported.
Individuals E, O, Q, and R report absent or high pain threshold. O has a history of a fractured foot and a dislocated kneecap with bone scans showing normal density. Impaired tactile sensation was reported in two individuals (M,S).
Ear, nose, throat (ENT) and vision. Six had chronic otitis media, with five of six having concurrent hearing impairment. Those experiencing chronic otitis media likewise had a preauricular pit, abnormal or blocked Eustachian tubes, abnormality of the tympanic membrane, enlarged vestibular aqueduct, choanal atresia, and increased size of nasopharyngeal adenoids. Hearing loss and recurrent infections including sinus, chronic ear, and upper respiratory infections were present in four individuals (O, P, Q, Y). Of the six with palatal anomalies, four had difficulties feeding.
Fig. 4 GestaltMatcher results. Sub-cluster P, J, F, M present with synophrys and wide noses. Sub-cluster O, H, R, Y, V, G, I present with thick eyebrows, prominent/broad nasal tips, macrodontia, triangular faces and pointed chins. Sub-cluster Q, S, D, E present with anteverted nares, broad nasal tips, and macrodontia. Link: https://db.gestaltmatcher.org/; individual links to each patient in Supplemental Text. Note: Individual E did not consent to having their photo published; however, a frontal photo was input into the GestaltMatcher and DeepGestalt algorithms.
Skeletal features. Of note, individual A was diagnosed with osteopenia, and later osteoporosis, at 15-20 years with low bone mineral densitometry in the lumbar spine, hip, and femoral neck. An x-ray of his left hand and wrist revealed physeal closure of the bones, excluding delayed bone maturation. Individual S has a visible sacral dimple and was referred to neurology for gait disturbance and urinary incontinence. MRI of her lumbar spine revealed a tethered spinal cord.
Cardiovascular features. Cardiac abnormalities were seen in approximately half the participants. While many resolved without the need for surgical intervention, individual K had Tetralogy of Fallot with pulmonary valve-sparing surgical repair at approximately 3-6 months of age, and individual T had mitral valve repair at around one year of age.
Gastrointestinal features. Participants F, M, S, T, and U had presumed diagnoses of abdominal migraines, characterized by stomach pain, nausea, and vomiting. In F, the abdominal migraines were accompanied by cyclic vomiting syndrome. Reports described her episodes as significant pain causing writhing, with a soft, nontender abdomen and normal bowel sounds on examination.
Endocrinology, metabolism, and immune system function. Short stature is a common phenotype in those with KBG syndrome, with up to 66% below the 10th centile in height [5]. Individuals H, J, and O were administered growth hormone. J was born with a length below the 1st centile and weight at the 57th centile. After receiving somatropin injections from 3.5 years to 5.7 years of age, his height is at the 13th centile and weight is at the 24th centile. O was given growth hormone from approximately 6 to 11 years of age, with improvement in weight (11th percentile at birth, now 45th percentile). Efficacy of hormone supplementation is unknown for H. Reports of precocious puberty, immunodeficiency, recurring infections, and allergies are also common.
Urogenital features. Urogenital disorders were seen in 48% (n = 12) of individuals, with seven being female and five being male. Of note, four males were diagnosed with cryptorchidism. Other diagnoses included abnormalities of the urethra and/or bladder, recurrent urinary tract infection, pollakiuria, polyuria, and enuresis.

DISCUSSION
We present 25 patients from 22 families with KBG syndrome, molecularly confirmed by identification of variants in ANKRD11. Our approach emphasizes data sharing and capitalizes on the increased use and security of videoconferencing technology, allowing access to participants outside of the United States, broadening the generalizability of our results.
Variants in ANKRD11 are linked to specific facial dysmorphologies; however, the disorder can be difficult to diagnose on facial phenotype alone. While some have a constellation of facial features typical of KBG syndrome, others may look different. This is reinforced by the presence of three different clusters of similar facial characteristics within our cohort detected by GestaltMatcher. There are clear phenotypic overlaps with other genetic syndromes, most notably, Cornelia de Lange Syndrome (CdLS) [24]. CdLS was listed as a differential diagnosis on DeepGestalt for several KBG syndrome photos. The ability of F2G to identify CdLS is 87%, compared with the experts' average of 77%. When additional photographs were added to the system for increased machine learning, the detection rate of the system increased to 94% [25].
GestaltMatcher identified KBG syndrome as top-10 in 60% of individuals and top-30 for 84% while DeepGestalt identified KBG as top-5 in 80% of cases. The latter "missed" five individuals with molecularly diagnosed KBG syndrome, pointing to the need for greater data collection and training, although as previously noted, three individuals (K, L, and M) are all from the same family, and possess the same missense variant with uncertain pathogenicity.
The PEDIA approach ranked ANKRD11 first in 72% of individuals, outperforming DeepGestalt, which ranked KBG syndrome first in 28%. We envision that an approach combining facial, feature, and exome analysis could be incorporated into future diagnostic pipelines.
We speculate that the craniofacial abnormalities and inner ear malformations seen in KBG patients are tied to the high rates of recurrent sinus infection and conductive hearing loss [5] seen in individuals F, H, and U. For example, F and H verbally reported malformed sinus and ear canals, whereas U received a CT of her paranasal sinuses showing right posterior choanal stenosis. However, 32% experienced conductive or sensorineural hearing loss without sinus infection; thus, more evidence is necessary to establish a correlation. A majority (83%; five of six) of those diagnosed with chronic otitis media had concurrent hearing loss, indicating a more likely correlation.
While ANKRD11 variants have been linked to autism spectrum disorder (ASD) [26,27], none of the interviewed children appeared to have a severe form of autism, as they were interactive, social, and maintained eye contact. Nevertheless, 28% had been given an ASD diagnosis by previous providers. Quantitative, longitudinal natural history studies using rating scales are warranted to elucidate how ASD symptoms manifest in KBG syndrome, independent of degree of intellectual disability.
Reduced pain sensation and impaired tactile sensation are previously unrecognized features of KBG syndrome, requiring further investigation. It is not clear whether reduced sensation is due to peripheral or central nervous system impairment. Migraines and abdominal migraines, the latter of which is not a well-known phenomenon, were novel findings in 24% and 20% of our cohort, respectively. The overall prevalence of abdominal migraines in childhood ranges from 2.4 to 4.1% and is more common in females, but the syndrome can be under-recognized in the population [28].
KBG syndrome is associated with cardiac anomalies [5][6][7] and diagnoses have been made on their detection. Individual B had a persistent left superior vena cava detected at 20 weeks on fetal ultrasound, which prompted karyotyping. Unfortunately, no further genetic work-up was done until five years of age when a KBG diagnosis was made by exome sequencing. Antenatal ultrasound may play a safe and non-invasive role in the detection of KBG syndrome and improve prenatal diagnosis and counseling.
We speculate that age of onset of seizures may be inversely linked to severity of developmental delay (Supplementary Text). Systematic and thorough descriptions and reporting of EEG abnormalities can guide physicians in prompt diagnosis. Obtaining a baseline EEG is likely warranted, since our data suggest that the developmental trajectory of those who have seizures differs from that of those who do not, with the latter showing better outcomes. While an ambulatory EEG for 24-72 h is ideal, a routine EEG for 30 min to 1 h could suffice. Future work should include a natural history study assessing age of onset and future levels of overall functioning, and convening an international summit of experts to develop consensus structured treatment guidelines for KBG syndrome.
Overall, the extent and variety of reported deficiencies in KBG syndrome patients could be attributed to the role of ANKRD11 as a chromatin regulator. Since ANKRD11 interacts with several key proteins of chromatin remodeling complexes, such as histone deacetylases, acetyltransferases, and nuclear co-receptors, and regulates global gene expression [29], it is unsurprising that mouse ANKRD11 regulates development and/or functioning of multiple tissues [29][30][31], and that KBG syndrome patients report systemic phenotypes affecting multiple organs. It is unknown why different KBG syndrome patients tend to have a variable number and severity of phenotypes and co-morbidities, although this is likely modified by different genetic backgrounds, environments, and some level of stochasticity.
Some limitations for the present study include the barriers that exist for those who are not familiar with videoconferencing technology. Participants were recruited primarily from referrals from a KBG Foundation Facebook group, further limiting participation to those adept in technology. Examination of stature and teeth morphology was limited over videoconference.

CONCLUSIONS AND TREATMENT RECOMMENDATIONS
• Early intervention with physical, occupational, and speech therapies is recommended. Anecdotal reports from families indicate that a frequency of at least once weekly is likely ideal. Children with ANKRD11 variants should undergo baseline auditory screening to rule out hearing defects that might impede speech development.
• There is evidence for the utility of growth hormone treatment for those with short stature (under their target range). Systematic study is required for formal guidelines and recommendations.
• High rates of seizures point at the possible utility of EEG screening upon diagnosis with regular monitoring by a neurologist. Further research is warranted to justify EEG screening, though other rare diseases with a high prevalence of seizures do have formal recommendations for baseline EEG screening, such as Tuberous Sclerosis [32].
• Patients may benefit from cardiac screening (including echocardiography) upon diagnosis.
• Chronic otitis media with hearing loss is a frequent finding. More research is needed to investigate whether aggressive antibiotic treatment could prevent hearing loss.
• Future research and clinical efforts should include more study of GI symptoms (e.g. abdominal migraines) due to their increased prevalence.
• Artificial intelligence-assisted facial applications can play a role in reducing missed diagnoses, given the often mild cognitive deficits and subtle dysmorphic features of KBG syndrome. Combining data from AI and patient registries can optimize diagnosis and help develop guidelines and treatment recommendations.

DATA AVAILABILITY
Data generated and analyzed during this study can be found within the published article and its supplementary files. Additional data are available from the corresponding author on reasonable request. The exome sequencing data were generated as part of clinical testing, so the underlying raw data are not consented for deposition to a public database.