Introduction

Since 30–40% of all patients with genetic disorders show characteristic craniofacial features [1], a detailed examination of the facial gestalt is important when phenotyping patients with suspected genetic disorders. However, the gestalt examination depends on the expertise of the examining physician [2]. As an additional approach, modern computer-aided image analysis programmes such as Face2Gene (FDNA Inc., Boston, USA) offer machine-learning-based algorithms combining facial recognition software with clinical knowledge to evaluate two-dimensional patient images (i.e. photographs) [3, 4]. Basel-Vanagaite et al. [3] were the first to show that this may be an effective tool in the diagnosis of rare genetic disorders. In molecularly proven cases of Cornelia-de-Lange syndrome, the approach has a diagnostic accuracy similar to that of clinicians [3]. Comparable results have been reported in other disorders with facial dysmorphisms [2, 5,6,7,8,9,10,11,12,13,14,15,16,17,18], suggesting that the technology complements conventional phenotyping and improves clinical diagnosis and laboratory analysis [19].

Microcephaly, short stature and limb abnormalities syndrome (MISSLA) is a rare dysmorphic autosomal recessive disorder caused by variants in the gene Downstream neighbour of SON (DONSON) [20,21,22,23,24]. As a component of the replisome, DONSON plays an important role in the stabilization of the replication fork [22, 23, 25]. DONSON promotes the activation of intra-S-phase and G2/M-phase checkpoints for the correction of replication errors and thus protects the nucleus against replication stress. Reduced or aberrant DONSON expression slows the rate of cell proliferation [22]. This explains some of the clinical signs, such as severe intrauterine and postnatal growth retardation or severe microcephaly. MISSLA includes anomalies of the extremities, such as mesomelia of all four extremities or radial abnormalities, as well as a craniofacial dysmorphism characterized by short palpebral fissures, microstomia, micrognathia, and a broad nose with a prominent bridge. Affected individuals often die perinatally due to incomplete lung development. A neuronal function was implied by altered movement patterns of Caenorhabditis elegans mutants featuring DONSON variants [26]. Since microcephaly, pre- and postnatal growth retardation, and skeletal abnormalities, especially radial ray defects, occur in both diseases, some MISSLA patients were originally diagnosed with Fanconi anaemia (FA) [21, 24]. FA is heterogeneous, with causative variants in 21 genes that are, just like DONSON, involved in the repair of DNA damage [27]. The cellular phenotype of hypersensitivity to DNA interstrand crosslinking agents is also observed in MISSLA patients, suggesting functional overlap of the involved genes [24, 28]. Notably, DONSON is known to act in the ATM- and RAD3-related (ATR)-dependent pathway [22], of which the FA proteins are substrates [29].

Features of FA not yet described in MISSLA include malformations of the heart and kidneys, and progressive bone marrow failure, as well as a significantly increased risk of malignancy [28, 30,31,32,33]. Characteristic facial features of FA are microcephaly and, in some cases, small eyes/microphthalmia [30].

Here, we present the clinical manifestations of two MISSLA siblings featuring a novel likely pathogenic DONSON variant. We also test the ability of Face2Gene RESEARCH to distinguish MISSLA from FA.

Patients and methods

Image analysis

We developed classifiers for the gestalt of MISSLA syndrome and FA analogous to the DeepGestalt technology. For the development and evaluation of classifiers based solely on the image data set described below, we used the Face2Gene RESEARCH application (www.face2gene.com). The underlying methodology of DeepGestalt has been recently described at length [4].

Acquisition of MISSLA and FA cases

PubMed and the Online Mendelian Inheritance in Man (OMIM) database were searched for images of MISSLA and FA patients suitable for this study.

PubMed was searched with the following Medical Subject Headings (MeSH terms) and title terms (including the different names of the syndrome caused by DONSON variants):

(“Fanconi Anemia”[Mesh]) AND (“Case Reports”[Publication Type] OR “Review”[Publication Type])

(DONSON[Title] OR MISSLA[Title] OR MIMIS[Title] OR MMS[Title] OR Microcephaly Micromelia[Title] OR Microcephaly Short Stature Limb Abnormality[Title] OR Ives-Houston[Title]) AND (Case Reports[Publication Type] OR Review[Publication Type])

Nineteen suitable photos of 18 FA patients were found in 12 case reports and reviews (Suppl. Table 1) via PubMed. Two images of two FA patients from Wiedemanns Atlas klinischer Syndrome [34] were added to build the final FA cohort.

The query yielded no suitable MISSLA patient images. Therefore, images were taken from the publications listed in the MISSLA-related OMIM entries yielding eight images of unrelated individuals. We added one image of the patient originally published by Schulz et al. [24] now depicted at the age of 2 years (Suppl. Fig. 1). The two images of the siblings reported here were not recognized by Face2Gene due to image quality reasons (closed eyes, non-frontal positioning of the head). The parents gave written informed consent for the publication of their children’s photos and data.

Patient 1

Patient 1 (Fig. 1a, c, Table 1) was a male child, born to healthy non-consanguineous parents of German descent at 38 weeks gestation. Prenatal ultrasound showed short long bones and ribs, club feet, thickened nuchal fold, and cleft palate. He had two healthy older sisters. He died directly after birth due to an untreatable lung hypoplasia. Birth length was 46 cm (−2.3 SD), weight 1960 g (−2.7 SD), and occipitofrontal circumference (OFC) 29 cm (−5.6 SD). Hypospadias was noted. Facial anomalies included a broad nose with hypoplastic nostrils, microstomia, retrognathia, dysplastic ears, full cheeks, and a short neck. The parents declined a post mortem examination, but a radiograph suggested a diaphragmatic hernia.

Fig. 1 Babygrams of (a) patient 1 and (b) patient 2. Note the bilateral shortened radius and ulna, hypoplasia of thumbs and club feet in patient 1, as well as bilateral radial and thumb aplasia and club feet in patient 2. Photos of (c) patient 1 and (d) patient 2. Note the facial features of both newborns: short palpebral fissures, a broad nose with hypoplastic nostrils, small mouth, and retrognathia, characteristic of MISSLA.

Table 1 Clinical features of published MISSLA patients

Patient 2

The younger sister of patient 1 (Fig. 1b, d; Table 1) featured severe intrauterine growth retardation, with shortened thin long bones and ribs and club feet. A thickened nuchal fold and a cleft palate were present. She was born at 38 weeks gestation. Birth length was 40 cm (−4.9 SD), weight 1590 g (−3.6 SD), and OFC 26.5 cm (−10.6 SD). She, too, died due to untreatable lung hypoplasia. Notable findings were bilateral aplasia of the radius and thumb with bowed ulnas, cutaneous syndactyly of fingers II–V and toes II–IV, and a sandal gap. The following facial features were noted: short and downslanted palpebral fissures, micrognathia, microstomia, a broad nose with hypoplastic nostrils, a short neck, and dysplastic ears. Ultrasound investigation showed atresia of the right choana and both external auditory canals. The parents again declined a post mortem examination.

Analysis by a 1M oligo aCGH (Agilent Technologies) was unremarkable. The siblings’ phenotype prompted Sanger sequencing of the protein coding part and flanking bases of DONSON. In patient 2, this revealed compound heterozygosity for the two variants c.[661T>C];[1433C>T] in exons 4 and 9, respectively (Suppl. Fig. 2). The parents were confirmed to be heterozygotes. No material from patient 1 was available for testing. All variants described in this paper refer to the DONSON transcript NM_017613.3 positioned at NC_000021.8:g.34949859-34961267; exons and introns are numbered as in Reynolds et al. (2017) and Evrony et al. (2017).

The variant c.661T>C (p.Trp221Arg) has not previously been reported in MISSLA. However, the variant in trans c.1433C>T (p.Pro478Leu) has previously been described in MISSLA [24] (American College of Medical Genetics (ACMG) criterion PM3). c.661T>C (p.Trp221Arg) is neither found in the Exome Aggregation Consortium’s (ExAC) database [35] nor in the database of the 1000 Genomes Project [36] (ACMG criterion PM2). Trp221 is highly conserved (Suppl. Fig. 2B), with a phyloP [37] score of 5.1 and this variant was predicted to be disease causing by MutationTaster [38] (ACMG criterion PP3). Given the patients’ highly MISSLA suggestive phenotype (e.g. skeletal malformations, lethal lung hypoplasia) and family history (ACMG criterion PP4) we classified c.661T>C (p.Trp221Arg) as “likely pathogenic” according to the ACMG guidelines for the interpretation of sequence variants [39] (2PM + 2PP). Variant and phenotypic data were submitted to the Leiden Open Variation Database (www.LOVD.nl/DONSON) Individual ID # 00207473.
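The combination of evidence codes cited above (two moderate plus two supporting criteria) can be checked against the combining rules for "likely pathogenic" in the ACMG guidelines [39]. The following is a minimal sketch under stated assumptions: the function name and argument names are hypothetical, only the "likely pathogenic" rules are encoded, and the rules for "pathogenic" and for benign classifications are deliberately left out.

```python
def meets_likely_pathogenic(pvs=0, ps=0, pm=0, pp=0):
    """Return True if the evidence counts satisfy one of the ACMG
    'likely pathogenic' combining rules.

    pvs, ps, pm, pp are the numbers of very strong, strong, moderate and
    supporting pathogenic criteria. Hypothetical helper for illustration;
    it does not test the 'pathogenic' or benign rule sets.
    """
    return (
        (pvs == 1 and pm == 1)          # 1 very strong + 1 moderate
        or (ps == 1 and 1 <= pm <= 2)   # 1 strong + 1-2 moderate
        or (ps == 1 and pp >= 2)        # 1 strong + >=2 supporting
        or pm >= 3                      # >=3 moderate
        or (pm == 2 and pp >= 2)        # 2 moderate + >=2 supporting
        or (pm == 1 and pp >= 4)        # 1 moderate + >=4 supporting
    )


# Evidence stated in the text: PM2, PM3 (moderate) and PP3, PP4 (supporting).
print(meets_likely_pathogenic(pm=2, pp=2))
```

With the counts given in the text (PM2, PM3, PP3, PP4), the rule "2 moderate and at least 2 supporting" is the one that applies.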

Control cohorts

To test the ability of Face2Gene RESEARCH to recognize MISSLA and FA we constructed three control image cohorts (Smith-Lemli-Opitz syndrome (SLOS), dysmorphic and non-dysmorphic).

Dysmorphic and Smith-Lemli-Opitz control cohorts

We searched the archive of the Institute of Medical Genetics and Human Genetics of the Charité Berlin for photos of patients with a molecularly confirmed diagnosis who did not have FA, MISSLA or SLOS and consented to the use of their images for scientific purposes. A maximum of two cases from the same family was used. Eventually, 183 images of 116 patients featuring 88 different syndromes were used for the cohort of dysmorphic patients (Suppl. Table 1). Eighty images of 65 SLOS patients from the SLOS cohort originally used by Pantel et al. [13] served as an additional control cohort (Suppl. Table 1).

Non-dysmorphic control cohort

To build a cohort of non-dysmorphic individuals, we searched online for 116 publicly accessible portrait photos matching the 116 cases of the dysmorphic control cohort regarding age, gender and ethnicity (Suppl. Table 1). Matching was performed because factors other than facial dysmorphism (such as ethnicity) may confound Face2Gene [2, 13]. Absence of facial dysmorphism was evaluated by at least two clinicians.

Suitable portrait photos

Portrait photos had to meet the following criteria: a picture had to feature an individual confirmed to have one of the disorders analysed in this study, the patient should not have any other genetic disorder and the entire face of the patient (hairline, both eyes, nose, mouth and chin) had to be recognizable (see Suppl. Fig. 1 for an example). If necessary, photos were manually trimmed prior to analysis by Face2Gene.

Image classification and statistical analysis

Image classifications were performed as described by Liehr et al. [8] with the Face2Gene RESEARCH application (v. 19.1.0). Cross-validation was performed according to the default settings of the application (method: holdout, splits: 10). In addition to Face2Gene CLINIC, which evaluates patient photos using the pretrained neural network DeepGestalt [4], Face2Gene RESEARCH enables the design of user-specific classifiers. Training the Face2Gene RESEARCH application with different numbers of cases per class may bias the classifier [13]. As the MISSLA cohort was the smallest, consisting of just nine images, we used these nine images and randomly selected nine cases from each of the other cohorts to run a classification experiment. This process was repeated 50 times (Fig. 2a) in order to minimize a potential bias from the nine images specifically selected per cohort.
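The balanced sampling scheme described above can be sketched as follows. This is an illustration only, not the procedure used inside Face2Gene RESEARCH: the function name, the fixed seed and the image identifiers are hypothetical placeholders, the cohort sizes are the image counts given in the text, and the sketch samples images although the text does not specify whether selection was by image or by case.

```python
import random


def balanced_draws(cohorts, n_per_class, n_repeats, seed=0):
    """Yield n_repeats balanced samples, each holding n_per_class
    randomly chosen images from every cohort (without replacement)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    for _ in range(n_repeats):
        yield {
            name: rng.sample(images, n_per_class)
            for name, images in cohorts.items()
        }


# Placeholder identifiers; only the cohort sizes come from the text.
cohorts = {
    "MISSLA": [f"missla_{i}" for i in range(9)],
    "FA": [f"fa_{i}" for i in range(21)],
    "SLOS": [f"slos_{i}" for i in range(80)],
    "dysmorphic": [f"dys_{i}" for i in range(183)],
    "non-dysmorphic": [f"ctrl_{i}" for i in range(116)],
}

# 50 experiments with nine images per class, as in the study design.
draws = list(balanced_draws(cohorts, n_per_class=9, n_repeats=50))
print(len(draws), {name: len(imgs) for name, imgs in draws[0].items()})
```

Because the MISSLA cohort has exactly nine images, every draw contains all of them, while the composition of the other four classes varies between repeats.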

Fig. 2 (a) Study design, composite syndrome images of (A1) MISSLA, (A2) FA, (A3) dysmorphic control, (A4) non-dysmorphic control, and (A5) SLOS control individuals. (b) Boxplot of the confusion matrix results. Rows depict actual classes; columns depict predicted classes. Correct classifications are shown in blue, false classifications in red. The dashed line indicates an accuracy of 20% (expected for random chance). Note: all mean true positive rates are above 20%. (c) Matthews correlation coefficients (MCCs). The dashed line indicates 0 (expected for random chance). Note: all mean MCCs are above 0. (d) Results of the binary classification. Distribution of the areas under the receiver operating characteristic curves (AUCs) and their standard deviations.

Face2Gene’s accuracy in the binary comparison of the MISSLA and FA cohorts was assessed by measuring the mean area under the receiver operating characteristic curve (AUC). The power to compare and separate all five classes (MISSLA, FA, SLOS, dysmorphic, non-dysmorphic controls) was evaluated by measuring true positive rates (TPRs) and false positive rates (FPRs) and by calculating the Matthews correlation coefficient (MCC), defined as

$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})\,(\mathrm{TP} + \mathrm{FN})\,(\mathrm{FP} + \mathrm{TN})\,(\mathrm{TN} + \mathrm{FN})}}$$

with TP = true positives, TN = true negatives, FP = false positives, FN = false negatives.

A value of 1 reflects a perfect classification, a value of 0 a purely random result and a negative value a result worse than that of a random classifier.
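As an illustration of these measures, the sketch below derives TP, FP, FN and TN for one class of a multiclass confusion matrix in a one-vs-rest fashion and then computes TPR, FPR and MCC. The confusion matrix is invented toy data (nine images per class), not a result of this study, and the function names are hypothetical. The FPR here is the standard one-vs-rest rate; returning 0 for an undefined MCC is a common convention adopted for this sketch.

```python
from math import sqrt


def one_vs_rest_counts(confusion, k):
    """Return (TP, FP, FN, TN) for class k.
    confusion[i][j] = number of images of actual class i predicted as j."""
    n = len(confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                       # class k, predicted otherwise
    fp = sum(confusion[i][k] for i in range(n)) - tp  # other classes predicted as k
    total = sum(sum(row) for row in confusion)
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 if the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (fp + tn) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


classes = ["MISSLA", "FA", "SLOS", "dysmorphic", "non-dysmorphic"]
# Toy confusion matrix for illustration only (rows: actual, columns: predicted).
confusion = [
    [8, 1, 0, 0, 0],
    [1, 4, 1, 2, 1],
    [0, 1, 6, 1, 1],
    [1, 2, 1, 3, 2],
    [0, 1, 1, 3, 4],
]

for k, name in enumerate(classes):
    tp, fp, fn, tn = one_vs_rest_counts(confusion, k)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    print(f"{name}: TPR={tpr:.2f} FPR={fpr:.2f} MCC={mcc(tp, fp, fn, tn):.2f}")
```

With five equally sized classes, a random classifier is expected to reach a TPR of 1/5 = 20% and an MCC of 0, which are the chance levels used as reference lines in Fig. 2b, c.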

To test for possible overfitting of the RESEARCH application, we also performed 50 classification experiments in which the 9 cases of each class of the actual experiments were randomly assigned to one of the 5 classes (Suppl. Fig. 3). Results were compared to the 50 actual experiments using a two-sided t-test.
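The idea behind this control can be sketched as follows: class labels are shuffled across the 45 images so that any remaining classification performance cannot stem from true class membership, and a metric from the shuffled runs is then compared with the same metric from the real runs. The sketch makes two assumptions that the text does not confirm: the shuffled classes keep nine images each, and the t statistic is the pooled-variance (Student) form. Function names are hypothetical, and converting the statistic to a two-sided p value (which requires the t distribution) is omitted.

```python
import random
from math import sqrt
from statistics import mean, variance


def shuffled_labels(n_classes=5, n_per_class=9, seed=0):
    """Return a random label for each image while keeping class sizes equal
    (assumption: the control experiments preserved nine images per class)."""
    labels = [c for c in range(n_classes) for _ in range(n_per_class)]
    random.Random(seed).shuffle(labels)
    return labels


def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (Student's form)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


labels = shuffled_labels()
print(labels[:9])  # random labels now attached to the first nine images
```

In such a comparison, `a` would hold a per-experiment metric (for example the MCC of one class) from the 50 actual experiments and `b` the same metric from the 50 shuffled-label experiments.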

Results

Gestalt composite images

To visualize the average appearance of the individuals of our cohorts, we created composite images using Face2Gene RESEARCH (Fig. 2a1–a5). The MISSLA composite shows distinct facial features, such as a broad nose with underdeveloped nasal alae, short and downslanted palpebral fissures, and microstomia. The FA composite indicates small eyes and microstomia. The dysmorphic composite shows no specific facial appearance, as it contains a wide range of different genetic disorders. The SLOS composite depicts anteverted nares, a broad nasal bridge, and ptosis.

Multiclass comparison

To determine the distinguishability of MISSLA syndrome and FA, we performed a computer-aided image classification of MISSLA, FA, SLOS and dysmorphic patients, and non-dysmorphic control individuals using the Face2Gene RESEARCH software. Notably, all median TPRs were above 20% (p < 0.05), i.e. better than the expected value for an evenly distributed result achieved by a 5-class random classifier (Fig. 2b). The highest TPRs were seen for the MISSLA cohort, with a median TPR of 84% (p = 2.7 × 10^−49). The median FPRs of other cohorts wrongly classified as MISSLA were lower than 20% (p < 0.05) (Fig. 2b). Remarkably, on average, only 6% of the MISSLA images were classified as FA (p = 9.9 × 10^−13).

The second highest median TPR (66%, p = 1.3 × 10^−32) was observed in the SLOS cohort. Notably, the FA cohort’s TPR amounted to 44% (p = 6.4 × 10^−20), indicating distinct facial features in some of the patients of the FA cohort.

Interestingly, of the 20 median FPRs, only the rate of non-dysmorphic images classified as dysmorphic was significantly higher than expected by random chance (34%, p = 1.7 × 10^−7) (Fig. 2b).

All five MCCs show positive values (Fig. 2c), indicating a classification better than random chance (p < 0.05). The highest MCCs were found for the MISSLA class (median MCC: 0.75, p = 3.0 × 10^−52), the lowest for the dysmorphic class (median MCC: 0.19, p = 42.7 × 10^−12).

Binary comparison

To directly test the distinguishability of MISSLA and FA, we also used the RESEARCH app for a binary comparison of the two cohorts. AUCs were particularly high (mean 0.91, p = 1.6 × 10^−21) with only low standard deviations (Fig. 2d), suggesting that the two syndromes differ in their facial appearance.
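For readers unfamiliar with the AUC, it equals the probability that a randomly chosen positive case (here, a MISSLA image) receives a higher classifier score than a randomly chosen negative case (an FA image), with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the scores are hypothetical toy values; Face2Gene RESEARCH reports AUCs itself, and its internal computation is not described here.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison:
    fraction of (positive, negative) pairs ranked correctly, ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy scores for illustration only (higher = more MISSLA-like).
missla_scores = [0.9, 0.4]
fa_scores = [0.5, 0.1]
print(auc(missla_scores, fa_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the mean AUC of 0.91 reported above should be read.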

Discussion and conclusion

Here, we report the clinical and molecular data of two previously unpublished siblings with MISSLA, caused by the known disease-causing variant c.1433C>T (p.Pro478Leu) and the novel variant c.661T>C (p.Trp221Arg) in DONSON. We summarize the phenotypes of all previously published MISSLA cases and present the first study to use a computer-aided image analysis approach to distinguish MISSLA and FA.

MISSLA and FA can be differentiated by computer-aided image analysis

Despite phenotypic overlap, our results reveal that MISSLA has a typical face that can be clearly distinguished from FA by computer-assisted image analysis. The high TPRs and MCCs for MISSLA and FA also indicate that they differ from the control groups. Whether FA has specific facial characteristics is sometimes debated. Our work supports the assumption of Avila et al. [40] that there is a characteristic facial gestalt of FA. It is, however, possible that the TPRs of FA are lower than those of MISSLA and SLOS because not all FA patients show these characteristic facial features. Knaus et al. [11] showed that DeepGestalt can classify patients of the same phenotypic series according to their mutated gene. Our FA cohort is based on FA patients featuring variants in different genes; thus, specific facial features may only be caused by some of these.

Notably, in their investigation of Emmanuel and Pallister-Killian syndromes, Liehr et al. [8] reported better differentiation between dysmorphic and non-dysmorphic cases than we have reported here. This might be due to the following reasons: (a) they used more images than we did, (b) their data set consisted of four not five classes, (c) their cohorts differed in the numbers of images used to build them, which according to Pantel et al. [13] may confound the classification process, and (d) their cohorts were not matched regarding age, gender, and ethnicity which could have confounded classification, too [2].

Pantel et al. [13] have demonstrated that more than nine images are required to build the best possible DeepGestalt classifier for a given syndrome. Unfortunately, only nine published images of unrelated MISSLA patients are suitable for Face2Gene. This shows the need for scientifically monitoring patients with rare, clinically variable diseases like MISSLA, and for sharing the findings.

MISSLA is not a new subtype of FA

One reason to assume that MISSLA represents a specific disease entity was the result of mitomycin C testing. In contrast to FA, Milner et al. [21] found no increased number of chromosomal aberrations after mitomycin C testing of MISSLA patients. However, Evrony et al. [23] and Schulz et al. [24] did find such aberrations. Some patients were even originally diagnosed with FA. This raised the question of whether, given the strong clinical overlap and the similarities in the cellular phenotype, MISSLA should be seen as a new subtype of FA.

We still assume that MISSLA is a specific clinical entity for four reasons: haematological symptoms, commonly seen in FA patients, have not been described in MISSLA; heterozygous carriers of DONSON variants have not been associated with an increased tumour risk; there are specific differences in skeletal malformations between MISSLA and FA; and, in particular, MISSLA has a characteristic facies.

Further research necessary

Face2Gene RESEARCH is supposed to work with images of different quality. We, however, did not test this.

Further research is also needed to determine the potential of other image analysis programmes, or a combination of such programmes, for the distinction of MISSLA and FA. Although we achieved a certain precision in the identification of MISSLA, FA, and other syndromes, our results could be improved. Our main obstacle was the limited number of suitable photos. The results of the control experiments make overfitting of Face2Gene RESEARCH to our limited data set unlikely. However, a larger number of photos would enable a more detailed investigation of the facial MISSLA phenotype and an analysis of a potential facial genotype–phenotype correlation in FA patients. The classifiers presented in this study did not reach the accuracy of a clinician in differentiating dysmorphic and non-dysmorphic faces.

However, we conclude that computer-assisted image analysis can support the subjective task of classical clinical diagnosis, potentially helping to identify MISSLA among clinically diagnosed FA patients who were not genetically confirmed.