Introduction

Intellectual disability (ID) can be an important manifestation of several genetic diseases, in addition to recognizable craniofacial dysmorphisms and congenital anomalies. Among ID syndromes, the developmental disorders of chromatin remodeling (DDCRs) are a group of malformation disorders characterized by a peculiar facial appearance and variable cognitive impairment, both due to abnormal chromatin remodeling. Variants in the genes encoding the regulatory enzymes of histone modifications (acetylation, methylation, phosphorylation) have been found to cause a subgroup of DDCRs, which includes Coffin-Lowry (CLS, MIM#303600) [1], Kabuki (KS, MIM#300867, 602113) [2], Koolen-De Vries (KDVS, MIM#610443) [3], Kleefstra (KLEFS1, MIM#610253) [4], Rubinstein-Taybi (RTS, MIM#180849) [5] and the autosomal dominant clinical spectrum of Ohdo syndrome (OS, MIM#603736, 603736) [6]. These syndromes are defined by specific craniofacial dysmorphisms, including abnormal conformation of the orbital region, nose and mouth, which are fundamental in guiding the diagnosis. In this process, clinicians can consult computer-based aids. DeepGestalt is the technology behind Face2Gene (FDNA Inc., Boston, MA, USA), a suite of phenotyping applications that facilitates genetic evaluation by processing patient images and comparing them with model images of known syndromes [7]. We conducted three experiments: two multiclass comparisons, which analyzed facial images of affected and unaffected individuals, and a face-crop study, which examined the whole face together with crops of the photos of affected and unaffected subjects in 5 principal facial regions (eyes, nose, mouth, upper face and lower face).
In the two comparison experiments, we analyzed a total of 120 affected subjects (6 cohorts, 20 total affected cases) using two-dimensional (2D) frontal facial photographs, comparing them with 60 unaffected controls (6 cohorts, 10 cases each) in experiment 1 and with 60 subjects (6 cohorts, 10 cases each) affected by other ID syndromes with distinctive dysmorphic features in experiment 2. To the best of our knowledge, this is the first study focused on disorders caused by mutations in the histone modifiers. We tested the diagnostic sensitivity of the DeepGestalt technology in recognizing anomalies of the entire face and of single facial regions, identifying peculiar and recurrent dysmorphisms, especially in some of the selected syndromes. At the same time, we highlighted the practical utility of computer-based aids in the diagnostic pathway of DDCRs and other genetic disorders.

Materials and methods

FDNA multiclass comparison

Face2Gene is a recent dysmorphology application suite which is able to recognize facial patterns of known malformation syndromes from 2D facial photographs and to combine them with clinical data (anthropometric parameters, clinical signs). It exploits deep learning algorithms, building syndrome-specific computational-based classifiers (syndrome gestalts), converting a patient photo into de-identified mathematical facial descriptors, and comparing the patient facial descriptor to syndrome gestalt to quantify similarity (gestalt scores). The final result is a prioritized list of possible matching syndromes.

In this study, a total of 180 frontal images (12 cohorts) were analyzed. Of these, 120 belonged to molecularly proven affected subjects (age 2–12 years, mostly Caucasian and a few Asian), deriving from the clinical activity of the involved Institution, according to the ethical guidelines currently applied in Italy and the Declaration of Helsinki, and from the medical literature. Institutional Review Board approval was therefore not required. Written informed consent for the collection and use of photographs was obtained. Every image was anonymized. In the first comparison experiment (experiment 1), we analyzed a sample consisting of 6 cohorts representative of 6 different syndromes related to alterations of the histone modifiers: CLS, KS, KDVS, KLEFS1, RTS and OS. We also considered 6 cohorts (60 cases) of unaffected controls, matched by age and gender. An additional cohort including other ID syndromes with recognizable craniofacial dysmorphisms (OIDS cohort, 6 cohorts, 60 total cases) was compared with the main study cohort (Intellectual Disability syndromes related to defects of the histone modifiers, IDSHM, 6 cohorts, 60 total cases) and with unaffected subjects (60 total cases) (experiment 2). The OIDS cohort comprised Coffin-Siris (CSS, MIM#135900), Cohen (COH1, MIM#216550), Cornelia de Lange (CDLS1, MIM#122470), Down (DCR, Down syndrome Chromosome Region Included, MIM#190685), Pitt-Hopkins (PTHS, MIM#610954) and Smith-Magenis (SMS, MIM#182290) syndromes. Selected images belonged to pediatric (age 2–12 years), mostly Caucasian patients. To perform all comparisons, we used the RESEARCH application of the Face2Gene tool, which allows the technology to be used in a controlled environment [8]. This application uses holdout cross-validation splits to compare cohorts of images. In each split, 50% of the images are randomly selected to train a classifier (train set) and the remaining 50% are used for validation (validation set). The process is repeated 10 times, each time with a different random split.
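The repeated holdout scheme described above can be illustrated with a minimal sketch. It is not the Face2Gene implementation: the facial descriptors are synthetic random vectors, the classifier is a simple nearest-centroid rule chosen only for brevity, the 50/50 split is drawn separately within each cohort, and all function names (`auc`, `holdout_auc`, `repeated_holdout`) are hypothetical.

```python
import random
import statistics


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def holdout_auc(cohort_a, cohort_b, rng):
    """One split: train centroids on half of each cohort, score the other half."""
    a, b = cohort_a[:], cohort_b[:]
    rng.shuffle(a)
    rng.shuffle(b)
    half_a, half_b = len(a) // 2, len(b) // 2
    ca, cb = centroid(a[:half_a]), centroid(b[:half_b])

    def score(v):
        # Higher score means the descriptor lies closer to the cohort A centroid.
        return sq_dist(v, cb) - sq_dist(v, ca)

    return auc([score(v) for v in a[half_a:]], [score(v) for v in b[half_b:]])


def repeated_holdout(cohort_a, cohort_b, n_splits=10, seed=0):
    """Mean and standard deviation of the AUC over n_splits random 50/50 splits."""
    rng = random.Random(seed)
    aucs = [holdout_auc(cohort_a, cohort_b, rng) for _ in range(n_splits)]
    return statistics.mean(aucs), statistics.stdev(aucs)


# Synthetic stand-ins for facial descriptors (assumption: 10 cases per cohort, 4 features).
data_rng = random.Random(42)
cohort_a = [[data_rng.gauss(1.0, 1.0) for _ in range(4)] for _ in range(10)]
cohort_b = [[data_rng.gauss(-1.0, 1.0) for _ in range(4)] for _ in range(10)]

mean_auc, sd_auc = repeated_holdout(cohort_a, cohort_b)
print(f"mean AUC = {mean_auc:.3f}, SD = {sd_auc:.3f}")
```

Reporting the mean together with the standard deviation over the 10 splits gives an indication of how much the estimate depends on which images happen to fall in the train set, which matters for cohorts of only 10 cases.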
The classification results are computed on the validation set and reported both numerically and graphically. The mean of the area under the curve (AUC) over the 10 splits is computed and reported with its standard deviation; these values represent the main results. Aggregated results are obtained as follows: the scores of each photo in the validation sets are pooled to produce a score distribution curve and a receiver-operating-characteristic (ROC) curve. To measure the statistical significance of the binary comparisons, a permutation test is applied, which estimates the distribution of the validation-set accuracy statistic under the null hypothesis. The system performs random permutations and trains models 1000 times, then tests the models on the validation set to obtain new AUCs. From the distribution of these AUCs, it calculates the one-sided p-value for the original AUC value. For comparisons of three or more cohorts, a multi-class experiment is conducted using all cohorts and a confusion matrix is computed. As in the binary comparisons, a holdout cross-validation method is used, in which 50% of the images in each split are randomly selected for training and 50% for validation. The values presented are the mean accuracy per cohort over all splits. The results shown in the matrix are normalized by dividing the values of each row by the sum of that row.
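The permutation test and the row normalization of the confusion matrix can be sketched as follows. This is a simplified illustration under stated assumptions, not the platform's procedure: instead of retraining a model for every permutation, the sketch shuffles the cohort labels over a fixed set of validation scores; the p-value uses the common add-one estimate, so it can never be exactly zero; the scores and counts are invented, and the function names are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def permutation_p_value(pos_scores, neg_scores, n_perm=1000, seed=0):
    """One-sided p-value: how often a label-shuffled AUC reaches the observed AUC."""
    rng = random.Random(seed)
    observed = auc(pos_scores, neg_scores)
    pooled = list(pos_scores) + list(neg_scores)
    n_pos = len(pos_scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # shuffling the pooled scores is equivalent to shuffling labels
        if auc(pooled[:n_pos], pooled[n_pos:]) >= observed:
            hits += 1
    # Add-one estimate: the observed labelling counts as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


def normalize_rows(matrix):
    """Divide every entry by the sum of its row (rows = actual cohort)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([cell / total if total else 0.0 for cell in row])
    return normalized


# Invented validation scores for an affected (pos) and an unaffected (neg) cohort.
pos = [0.90, 0.80, 0.85, 0.70, 0.95]
neg = [0.20, 0.40, 0.10, 0.30, 0.35]
observed_auc, p_value = permutation_p_value(pos, neg)
print(f"AUC = {observed_auc:.2f}, one-sided p = {p_value:.4f}")

# Invented counts: rows are actual cohorts, columns are predicted cohorts.
counts = [[8, 1, 1], [2, 6, 2], [0, 1, 9]]
normalized = normalize_rows(counts)
for row in normalized:
    print([round(cell, 2) for cell in row])
```

After row normalization, the diagonal entry of each row is the fraction of that cohort's images assigned to the correct cohort, which is how the TP values of the confusion matrices can be read.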

In the first phase, we conducted binary and multiclass experiments, comparing the affected cohorts with each other and with unaffected controls. Composite photos of each cohort were computed (Figs. 1a and 2a).

Fig. 1

Multiclass comparison analysis. a Confusion matrix showing the True Positive (TP) values, which lie on the diagonal. The other rates represent errors (False Positives, FP, and False Negatives, FN). The actual cohorts of affected and unaffected individuals, represented by their composite photos, are arranged vertically (actual). The syndrome masks predicted by the FDNA system are arranged horizontally (predicted). The composite photos of the affected and unaffected cohorts are shown at the top right (cs. = cases, im. = images). b An example of binary comparison between KS and RTS. Note the visibly separated curves, demonstrating that the two conditions are well distinguished. Alongside, the ROC curve with its high AUC value (1.00). c Binary comparison between KS and the unaffected cohort. Note the non-overlapping curves, corresponding to a considerable capacity of the tested platform to differentiate them

Fig. 2

Comparison experiment 2. a Confusion matrix and composite photos of the three analyzed cohorts (IDSHM, OIDS and unaffected controls). b-c Binary comparison. Note the scant overlap when the affected cohorts are compared with unaffected controls. A modest overlap can be identified in the IDSHM vs. OIDS comparison

Face-crop comparative analysis

Single facial regions (upper face, eyes, nose, lower face, mouth) and the whole face of the 6 affected and unaffected cohorts were subsequently analyzed. As above, each affected cohort was compared with the other affected cohorts and with each unaffected cohort. The results were expressed in terms of AUC and ROC (Fig. 3a–e).

Fig. 3

Face-crop analysis. Analyzed regions: a eyes, b nose, c mouth, d upper face and e lower face. For each area, a comparison histogram with the AUC values of affected cohorts (on the left) and unaffected controls (on the right) is shown. Note the high AUC values, especially in the upper face analysis

Results

Multiclass and binary comparisons

In comparison experiment 1, the confusion matrix showed TP values ranging from 0.10 (unaffected cohort 1) to 0.99 (KDVS), with a mean accuracy of 54.8%, a standard deviation of 12.25% and a random chance for comparison of 11.05% (Fig. 1a). In the comparisons between affected cohorts, the AUC of the ROC curve was optimal (1.00) for the following pairs: CLS vs. KS, CLS vs. KDVS, CLS vs. RTS, KLEFS1 vs. RTS, KS vs. KDVS, KS vs. RTS, KDVS vs. OS, OS vs. KS. In the binary comparisons of affected vs. unaffected cohorts, significant results (AUC 1.00) were obtained for CLS, KS, KDVS and OS, with a p-value < 0.001.

In comparison experiment 2, TP values ranged from 0.65 (OIDS) to 0.83 (unaffected controls), with a mean accuracy of 73.11%, a standard deviation of 6.62% and a random chance for comparison of 33.33% (Fig. 2a). The binary analysis recorded an AUC of 0.832 and a p-value of 0.000 in the IDSHM vs. OIDS comparison, while the comparisons of the IDSHM and OIDS affected cohorts vs. unaffected controls showed AUCs of 0.965 and 0.927, respectively, with a p-value of 0.000 in both (Fig. 2b–d).

Face-crop comparative analysis

High AUC values were obtained for the periorbital region of all conditions when compared with each other, in particular in CLS, OS and KLEFS1. KS and RTS scored highly against the other diseases but reached a lower value when compared with each other. The nose was well recognizable in RTS, KS and CLS, while the mouth was most distinguishable in CLS, RTS, OS and KS. The upper face analysis was consistent in all syndromes, in particular in CLS, KDVS, KS and RTS, and the whole-face analysis likewise gave high values in all diseases. In the comparison with unaffected controls, the conformation of the eyes proved to be a significant dysmorphic sign in CLS, OS, KDVS and KS, while the nasal region gave optimal values in the first three conditions. The mouth was informative in CLS, OS, KDVS and KS, and CLS and OS obtained the highest scores across all facial regions (Fig. 3a–e).

Discussion

Chromatinopathies are a wide group of genetically heterogeneous ID syndromes caused by defective chromatin remodeling, and they are frequently encountered in clinical practice. Mutations of genes encoding the enzymes involved in histone modifications, such as histone acetyltransferases (HATs), deacetylases (HDACs), methyltransferases (HMTs), demethylases (HDMs), kinases and phosphatases, define a molecular subcategory of DDCRs. Major facial dysmorphisms, including anomalies of the eyes (hypertelorism, blepharophimosis, blepharoptosis, abnormal ocular conformation), nose (broad, beaked, bulbous or short nose, anteverted alae nasi) and mouth (thin or tented upper lip, thick lips, full everted lower lip), are frequently identified and represent necessary clinical handles for a correct diagnosis. Geneticists can be supported in their clinical activity by valid computer-based aids. The Face2Gene platform has recently been introduced to facilitate the diagnostic process. It is a dysmorphology suite that compares facial syndrome classifiers with patient photos uploaded by the user. Other authors have already tested it, mainly studying monogenic syndromes, metabolic disorders and chromosomopathies [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24].

In this study, we focused on the DeepGestalt analysis of DDCRs related to mutations of the genes encoding the histone-modifying enzymes, which is, to our knowledge, a novel research topic. The purpose was to identify the principal facial areas to consider for diagnosis, both among the different syndromes and in comparison with the unaffected population. We therefore performed three distinct experiments: two multiclass comparisons and, subsequently, a face-crop analysis. The first two showed significant results in the comparison of the affected cohorts with each other and with unaffected controls, indicating that the single syndromes were well recognized by the platform and that the FDNA system was able to differentiate them. Indeed, the confusion matrix highlighted high TP values for the affected cohorts. Binary comparison showed non-overlapping score distribution curves and high AUC values for the ROC curves in the comparisons between syndromes, with low p-values (Figs. 1 and 2). Regarding the comparison of affected cohorts, CLS was distinguishable in particular from KS, KDVS and RTS, and KS from KDVS, RTS and OS (Fig. 1b). These findings were in line with the distinctive dysmorphisms of the analyzed syndromes. Indeed, the CLS face is defined by coarse features, downslanting palpebral fissures, hypertelorism, a short and broad nose with thick alae nasi and anteverted nares, and a large, open mouth with thick lips and an everted lower lip. It is therefore discernible from the peculiar dysmorphisms of KS, which consist of long palpebral fissures with eversion of the lateral portion of the lower eyelid, ptosis, arched and broad eyebrows with a sparse lateral third, epicanthal folds, short columella and an open mouth with a tented upper lip.
Furthermore, CLS, KS and KLEFS1 were differentiated from RTS, which presents a distinctive beaked nose with a prominent and short columella and a small mouth, as well as from OS, whose ocular involvement with blepharophimosis and ptosis constitutes the most noticeable dysmorphological sign. KDVS was another clinical entity that was discriminated principally from KS and OS. Moreover, CLS, KS, KDVS and OS were not confused with controls, documenting their marked facial gestalt.

The results of the second comparison experiment (experiment 2) confirmed the presence of distinctive craniofacial dysmorphisms in both syndromic cohorts, which were well distinguished from unaffected controls. This was demonstrated by the marked separation between the two score distribution curves. A greater overlap between the two curves was noted in the IDSHM vs. OIDS comparison, most probably because the OIDS cohort included some syndromes that share with IDSHM either similar craniofacial features, such as Down syndrome, or molecular determinants, such as Coffin-Siris syndrome. Indeed, CSS is a well recognizable DDCR caused by mutations in components of the chromatin remodeling SWI/SNF complex. Smith-Magenis syndrome, which we included in the OIDS cohort, can also be considered a chromatinopathy, as recently postulated [25].

The analysis of single facial details obtained in the face-crop experiment gave consistent results for CLS and KS (eyes, nose and mouth), OS (eyes and mouth) and RTS (nose and mouth) in the comparison between affected and unaffected cohorts (Fig. 3a–e). In particular, the distinctive orbital region of CLS, OS and KS was differentiated from that of unaffected individuals, as demonstrated by binary analysis (Fig. 3a). In the comparison with controls, the nose analysis in RTS unexpectedly yielded slightly lower values than in CLS, OS and KDVS, while high values were recorded for KDVS, whose nasal region is defined by a characteristic tubular or pear-shaped nose and a bulbous nasal tip. A lower sensitivity in the recognition of some distinctive facial details, such as the nose in RTS, could be related to the quality or framing of the chosen image.

In conclusion, this study represents the first computer-assisted dysmorphology analysis of the subgroup of chromatinopathies due to mutations of the histone remodelers, which are frequently encountered in clinical genetics practice. Our first multiclass experiment revealed a remarkable capacity of the DeepGestalt technology to recognize and differentiate the analyzed syndromes, in terms of high AUC and low p-values, suggesting that the platform is reliable. Similar considerations apply to the second experiment and to the face-crop analysis, which revealed the single facial details of the considered syndromes in accordance with their phenotypic characteristics. Interestingly, all syndromes obtained high values in the upper face and whole-face analyses, corroborating that this group of syndromes is characterized by an unusual face. Furthermore, optimal results were obtained especially for CLS, KS and OS in the comparison of affected and unaffected cohorts, indicating their marked dysmorphic features, which include, respectively, a coarse face, a Kabuki-mask resemblance and the typical association of blepharophimosis, blepharoptosis and epicanthus. Single facial details such as the nasal area proved to be an important clinical clue, especially for KDVS and RTS, as expected, while KLEFS1 appeared to have more dysmorphic signs in the nasal and ocular areas than in the mouth region (Fig. 3). In addition, the lower half of the face seemed to be a significant clinical clue in KDVS (Fig. 3). Interestingly, most dysmorphisms appeared to be localized in the orbital and labial regions, suggesting that clinical geneticists should observe these facial areas carefully. We could therefore speculate that mutations of the Chromatin-Marking System (CMS) alter the development of these two facial regions during embryogenesis. This is an interesting aspect to investigate further in future studies.

Finally, our study suggests that clinicians could use computer-assisted facial analysis to confirm a diagnostic suspicion of syndromic diseases such as chromatinopathies, which are defined by a distinctive abnormal craniofacial contour. However, the diagnostic limits of computer-based systems, which are strictly linked to technical variables (for example, the framing or quality of the analyzed image) or to patient-related factors (age or ethnicity), must be considered. We hope that these technologies will become more reliable in the near future, so that they can better support clinical activity.