Congenital diarrhea and enteropathies (CODEs) are a heterogeneous group of inherited disorders that present with severe chronic diarrhea in the first few months of life, often requiring nutritional support and extensive medical care. The classification and new diagnostic approach for CODE have recently been reviewed.1 Based on next-generation sequencing, rare Mendelian diseases have been identified among patients with CODE. Early recognition and prompt intervention by means of precision medicine are crucial for CODE patients and their families.

The aim of this study was to perform a sequencing analysis of a large cohort of patients with CODEs and to investigate their clinical diagnoses and subsequent management with implications for precision medicine.


Patient cohort

Patients were recruited retrospectively between 1 January 2013 and 1 September 2018 at Children’s Hospital of Fudan University. The inclusion criterion was chronic diarrhea in early infancy. Patients with acquired diarrhea, including allergic disorders, neonatal necrotizing enterocolitis, and short bowel syndrome, were excluded. Clinical manifestations; laboratory, abdominal computed tomography, upper and lower endoscopy, and histology results; treatment; and prognosis were reviewed.

This study was approved by the Ethical Committee of Children’s Hospital, Fudan University. Informed consent for participation and blood sample collection was obtained from the parents of the patients.

Genetic sequencing and variant assessment

Exome sequencing was performed among patients. Genomic DNA was extracted from the peripheral whole blood of patients and their parents, and exomes were captured using the Agilent SureSelectXT Human All Exon 50-Mb kit. Sequencing on the Illumina HiSeq2000/2500 sequencer (Illumina, San Diego, CA, USA) yielded an average coverage of 100×. Patients who presented in the neonatal period underwent sequencing with a 2742-gene panel, patients with primary immunodeficiency confirmed by immunological investigations underwent sequencing with a 270-gene panel, and the remaining patients underwent sequencing with a 3530-gene panel containing reported single-gene disorders. Sequence read alignments were completed using Novoalign (V2.07.18) against the human reference genome GRCh37.p10. The bioinformatics pipeline was applied as previously reported.2 In brief, after quality control, variants were filtered by means of public databases, including Human Gene Mutation Database (HGMD) Professional, the Exome Aggregation Consortium (ExAC), and an in-house database. Sanger sequencing was performed with a Biosystems 3500 DNA Analyzer and analyzed by Mutation Surveyor V4.0.9 to confirm the causal variants.
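The database-based filtering step can be illustrated with a minimal sketch. It is not the pipeline used in this study: the field names, the 1% allele-frequency cutoff, the rule of listing HGMD entries first, and the example records are all hypothetical, since the text does not report these details.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoff: the text does not report an allele-frequency threshold.
MAX_AF = 0.01


@dataclass
class Variant:
    label: str         # placeholder identifier, not a real variant
    passed_qc: bool    # survived quality control
    exac_af: float     # allele frequency in ExAC (0.0 if absent)
    inhouse_af: float  # allele frequency in the in-house database
    in_hgmd: bool      # listed in HGMD Professional


def filter_variants(variants: List[Variant], max_af: float = MAX_AF) -> List[Variant]:
    """Keep QC-passing variants that are rare in both databases.

    HGMD-listed variants are placed first (an assumed prioritisation rule).
    """
    rare = [
        v for v in variants
        if v.passed_qc and v.exac_af < max_af and v.inhouse_af < max_af
    ]
    # sorted() is stable, so the input order is preserved within each group.
    return sorted(rare, key=lambda v: not v.in_hgmd)


# Illustrative records only.
variants = [
    Variant("rare_novel", True, 0.0, 0.0, False),
    Variant("common_exac", True, 0.20, 0.15, False),
    Variant("rare_hgmd", True, 0.0001, 0.0, True),
    Variant("failed_qc", False, 0.0, 0.0, True),
]
candidates = filter_variants(variants)
print([v.label for v in candidates])
```

In this sketch, candidates that survive filtering would then proceed to Sanger confirmation.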

Statistical analysis

Data were analyzed using SPSS 24.0 for Windows (SPSS Inc., Chicago, IL, USA). Continuous data were presented as the mean and standard deviation (SD) or median and interquartile range (IQR). Categorical variables were reported as the frequency and percentage. The chi-square test or Fisher’s exact test was used to compare categorical variables. A two-tailed p value < 0.05 was considered statistically significant.


Demographic features of patients

We enrolled 137 CODE patients. Among them, 83 (60.6%) were males and 54 (39.4%) were females. Consanguineous marriage was found for one patient, for whom the grandparents were cousins. Eighteen patients (13.1%) had a family history of enteropathy in early infancy. The median age of disease onset was 28.0 (IQR: 7.5–120.0) days. Patients were classified based on diarrhea type: 27 patients had watery diarrhea, 7 had steatorrhea, and 103 had bloody diarrhea. Detailed demographic and phenotypic information is shown in Table 1. There were no significant differences in mortality between the different subgroups (p = 0.166). Malignancy was identified in one patient who had non-Hodgkin lymphoma. Intestinal perforation was identified in 9 patients (6.6%).

Table 1 Demographic features and phenotypes of congenital diarrhea and enteropathy (CODE) patients

In our cohort, 73 patients were screened using trio exome sequencing, and 64 patients underwent capture-based targeted sequencing. Sixty-nine patients (50.4%) had disease onset in the neonatal period, and 59 (43.1%) had disease onset within 2 years of age.

Endoscopy, histology, and imaging

In this study, 103 patients underwent upper endoscopy, and 107 patients underwent lower endoscopy and histological evaluations.

In the watery diarrhea group, abnormal crypt/villus architecture was found in 10 patients (37.0%), dilated lymphatics were found in 7 patients (25.9%), and normal crypt and villus architecture was found in 5 patients (18.5%). The remaining 5 patients (18.5%) could not tolerate or refused endoscopy. Histological data showed villus atrophy with epithelial tufts in patient 101, which was described previously.3 Hematoxylin and eosin (H&E) staining revealed epithelial apoptosis and loss of Paneth cells in the colonic biopsies of patient 48 (Fig. 1a, b). Duodenal biopsies of patient 75 showed flattened villi, while further electron microscopy showed microvillus atrophy, vesicular bodies, and numerous lysosomes in the same patient. Flattened villi and abundant lysosomes were also noted in the enterocytes of patient 105 by electron microscopy. For patient 104, histological analysis of duodenal biopsies showed lymphoid infiltrates and epithelial apoptosis by light microscopy. However, there was no significant finding by electron microscopy.

Fig. 1 Histologic and immunohistochemical assessments of CODE patients. (a) Hematoxylin and eosin (H&E) analysis of the descending duodenum shows the loss of goblet cells and Paneth cells and the presence of apoptotic cells in patient 48. (b) H&E analysis of the small bowel shows hypoplasia of villi and crypt, with monocytic infiltration in all layers. (c) H&E analysis of colonic biopsy reveals histiocytic infiltration and irregular cellularity in patient 99. (d) Immunohistochemical staining shows CD1a marker positivity in the same patient.

In the bloody diarrhea group, colonic ulcerations were found in some patients. Histological analysis revealed intestinal crypt abscesses and inflammatory infiltration. The colonic biopsy of patient 99 revealed histiocytic infiltration and CD1a marker positivity by immunohistochemistry (Fig. 1c, d). The plain film showed lytic bone lesions in the skull, ribs, and tibia in this patient.

In the fatty diarrhea group, diffuse pancreatic lipomatosis was noted in the enhanced computed tomography (CT) of the abdomen in all patients. Metaphyseal dysplasia on plain films was identified in three patients (Fig. 2a, b).

Fig. 2 Radiologic findings of patients with fatty diarrhea. (a) Enhanced computed tomography (CT) of the abdomen shows pancreatic lipomatosis in patient 89. (b) A lower limb plain film shows metaphyseal dysplasia in patient 89.

Molecular diagnostic outcome

Trio exome sequencing and capture-based targeted sequencing confirmed the genetic diagnosis in 88 of the 137 patients (64.2%) (Table S1). No pathogenic variant was found among the remaining 49 patients. The pathogenic variant spectrum is shown in Fig. 3. The distribution of the identified genes by age of onset is shown in Fig. 4. A summary of all pathogenic variant types in the 17 genes is provided in Table S2.

Fig. 3 Bar plot for the number of genes identified among the congenital diarrhea and enteropathy (CODE) patients.

Fig. 4 Distribution of genes with identified pathogenic variants by age of onset.

Although consanguineous marriage was documented for only one patient, homozygous variants were confirmed in 18 other patients.

The mode of inheritance was autosomal recessive for ten of the genes (CCBE1, DGAT1, DUOX2, EPCAM, IL10RA, LRBA, MYO5B, SBDS, SLC5A1, and UBR1) and autosomal dominant for three genes (TNFAIP3, CARD11, and ELANE). Four genes (CYBB, FOXP3, WAS, and IKBKG) were found to be X-linked recessive.

The diagnostic rate was 68.0% (70/103) in the bloody diarrhea subgroup, 48.1% (13/27) in the watery diarrhea subgroup, and 71.4% (5/7) in the fatty diarrhea subgroup (p = 0.07). The diagnostic rate was higher in the neonatal group than in the group of patients who had disease onset within 2 years of age (75.4% vs. 57.6%, p = 0.033).

Correlation between histological findings and genetic sequencing

There was no significant correlation between the diagnostic yields of sequencing and histopathology in patients with watery diarrhea (p = 0.544). No pathogenic variant was found in patient 48, patient 104, or patient 105, while histopathology revealed abnormal villi in these patients. Among patients with markedly dilated lymphatics in duodenal biopsies, we identified one patient with pathogenic DGAT1 variants.

Genetic spectrum and clinical implications in subgroups with different phenotypes

Patients with watery stool

Among patients presenting with watery diarrhea, we identified pathogenic variants in the CARD11, CCBE1, EPCAM, FOXP3, MYO5B, DGAT1, and SLC5A1 genes. Eight patients (29.6%) were diagnosed with actionable medical disorders due to variants in the EPCAM, FOXP3, DGAT1, and SLC5A1 genes. Pathogenic variants were found in the SLC5A1 gene in patient 130, who presented with severe diarrhea, dehydration, and hypernatremia during the neonatal period. He was diagnosed with congenital glucose–galactose malabsorption (CGM, OMIM 606824), and his condition markedly improved after a low-carbohydrate diet was instituted. We also identified EPCAM pathogenic variants in patient 101 and patient 20, who were diagnosed with tufting enteropathy (OMIM 613217). Neither immunosuppressants nor hematopoietic stem cell transplantation (HSCT) was recommended for them because these treatments were unlikely to be curative.4 Both patients improved with total parenteral nutrition.

Next-generation sequencing confirmed pathogenic variants in FOXP3 in patient 73, patient 74, and patient 125. These patients were diagnosed with immune dysregulation–polyendocrinopathy–enteropathy–X-linked syndrome (IPEX, OMIM 304790). Patient 73 received HSCT and is now in clinical remission. Patient 125 received anti-TNF therapy as a bridge to HSCT. Patient 74 was lost to follow-up.

A low-fat enteral diet can markedly improve diarrhea in patients with DGAT1 pathogenic variants.5 However, two patients (patient 77 and patient 78) died before their exome sequencing results were available.

Microvillus inclusion disease (OMIM 251850) due to MYO5B pathogenic variants is a rare disease with a poor prognosis. The parents of patient 75 withdrew treatment, and the patient died.

Patients with fatty stool

In the fatty diarrhea subgroup, variants were found in the UBR1 and SBDS genes. Three patients (patient 89, patient 91, and patient 92) were identified with SBDS pathogenic variants and diagnosed with Shwachman–Diamond syndrome (OMIM 260400). Two patients (patient 88 and patient 90) had variants in the UBR1 gene and were diagnosed with Johanson–Blizzard syndrome (OMIM 243800). No pathogenic variant was identified in patient 93 or 94, who underwent exome sequencing. All seven patients received pancreatic enzyme replacement therapy. Some of them (n = 4) have been previously described.6

Patients with bloody stool

We identified pathogenic variants in the IL10RA, TNFAIP3, LRBA, CYBB, IKBKG, DUOX2, WAS, and ELANE genes in the bloody diarrhea subgroup. A total of 112 pathogenic IL10RA variants were found in 56 patients with very early-onset inflammatory bowel disease (OMIM 613148). Thirty-one (55.4%) patients were treated with thalidomide, and 22 patients required surgery. Twenty-six (46.4%) patients received HSCT, which was considered to be a curative therapy.7 Among them, 25 patients underwent umbilical cord blood transplantation (UCBT), and one patient received haploidentical transplantation. Twenty patients were in clinical remission after HSCT, and the other six patients died. Some of the patients have been previously reported.8,9,10,11 The overall mortality rate was 25.0% for the 56 patients with IL10RA variants. There was no significant difference in the mortality rate between patients who did and did not receive HSCT (p = 0.757).
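The mortality comparison can be made concrete by back-calculating the group counts from the figures above. This is a sketch rather than the authors' analysis: it assumes that 25.0% of 56 corresponds to exactly 14 deaths and that the six deaths after HSCT are included in that total; the number of deaths among patients without HSCT is not stated in the text.

```python
# Figures reported in the text for patients with IL10RA variants.
n_patients = 56
overall_mortality = 0.25  # "25.0%"
n_hsct = 26               # patients who received HSCT
deaths_hsct = 6           # deaths after HSCT

# Assumption: the overall rate corresponds to a whole number of deaths and
# includes the six post-HSCT deaths.
deaths_total = round(overall_mortality * n_patients)
n_no_hsct = n_patients - n_hsct
deaths_no_hsct = deaths_total - deaths_hsct

rate_hsct = deaths_hsct / n_hsct
rate_no_hsct = deaths_no_hsct / n_no_hsct

print(f"HSCT:    {deaths_hsct}/{n_hsct} = {rate_hsct:.1%}")
print(f"No HSCT: {deaths_no_hsct}/{n_no_hsct} = {rate_no_hsct:.1%}")
```

Under these assumptions the two groups have similar mortality (roughly 23% versus 27%), which is in line with the non-significant p value reported.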

Based on next-generation sequencing and immunological evaluations, ten patients were diagnosed with primary immunodeficiency due to pathogenic variants identified in CYBB (n = 7; chronic granulomatous disease, OMIM 306400), IKBKG (n = 1; anhidrotic ectodermal dysplasia with immunodeficiency, OMIM 300291), and ELANE (n = 2; severe congenital neutropenia, OMIM 202700). Among the patients with CYBB variants, two patients were treated with HSCT, and three were treated with interferon gamma and prophylactic antibiotics. HSCT is also likely to be curative for patients with Wiskott–Aldrich syndrome (OMIM 301000). However, the parents of patient 32 refused treatment, and the patient was lost to follow-up.

Histological staining and immunohistochemical analysis confirmed the diagnosis of Langerhans cell histiocytosis in patient 99. She received induction chemotherapy and responded to treatment. Based on the molecular diagnosis, the therapeutic approach was changed for three patients (patient 21, patient 50, and patient 51) with TNFAIP3 pathogenic variants from biologics to thalidomide, which has been described previously.12


Most CODEs are monogenic in nature and can be divided into disorders caused by genetic variants that directly affect the intestinal epithelium and those caused by variants that affect the immune system.1 To the best of our knowledge, this is the largest cohort of CODE patients. Molecular diagnosis is crucial for CODE patients, as the results can inform potential therapeutic targets.4 HSCT is likely to be curative for monogenic diseases caused by pathogenic variants of the IL10, IL10RA, IL10RB, CYBB, CYBA, NCF1, NCF2, NCF4, FOXP3, and XIAP genes7 but is not helpful for EPCAM or TTC7A pathogenic variants.4,13

Petersen et al. reported that targeted gene panel sequencing revealed monogenic disease in 5 of 71 patients with early-onset inflammatory bowel disease and chronic diarrhea.14 In another large cohort of inflammatory bowel disease patients presenting before 2 years of age, 31% of 62 patients had monogenic diseases.15 In our study, the diagnostic rates were 71.9% (46/64) for targeted gene panel sequencing and 57.5% (42/73) for exome sequencing (p = 0.081). Targeted gene panel sequencing may have higher diagnostic rates than exome sequencing due to selected coverage and included genes.16 Overall, we reported a higher percentage of patients with monogenic diseases than previous reports.15,17
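The comparison between the two sequencing strategies can be checked with a small sketch. It assumes an uncorrected Pearson chi-square test on a 2×2 table; the text states only that the chi-square or Fisher's exact test was used, so the choice of test and the function name here are illustrative.

```python
import math
from typing import Tuple


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, two-sided p value). With one degree of freedom the
    chi-square survival function reduces to erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Diagnosed and undiagnosed counts from the text:
# targeted panel 46/64, exome sequencing 42/73.
stat, p_value = pearson_chi2_2x2(46, 64 - 46, 42, 73 - 42)
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}")
```

With these counts the result comes out close to the reported p = 0.081, which suggests, but does not confirm, that an uncorrected test was used.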

Endoscopy, H&E analysis, and specific immunohistochemistry can further enhance the diagnostic process for CODE patients.1 In this study, we identified one patient with pathogenic MYO5B variants. To date, there are only two microvillus inclusion disease cases that have been confirmed by genetic sequencing in the Chinese population.18,19 Neither of these two cases underwent endoscopy. In our study, the H&E analysis and electron microscopy results were consistent with microvillus inclusion disease, and pathogenic MYO5B variants were confirmed by sequencing. A recent study demonstrated, by immunoelectron microscopy, that abnormal Rab11-Rab8-vesicles clustered in the enterocytes of patients, which further emphasized its pathophysiological significance.20 However, no significant correlation was found between the diagnostic yields of sequencing and histopathological analysis in the watery diarrhea subgroup. These CODE patients might have variants of novel genes.1

Patients with watery stool

In addition to microvillus inclusion disease, many epithelial-specific CODE disorders require lifelong parenteral nutrition and are not suitable for HSCT.1 In this study, two patients were identified with EPCAM pathogenic variants by panel sequencing and exome sequencing, respectively. Their condition improved after total parenteral nutrition. A long-term follow-up study of 13 tufting enteropathy patients showed a survival rate of more than 92% for those who relied on parenteral nutrition at home.21 Only one CGM case with SLC5A1 pathogenic variants has previously been reported in the Chinese population.22 In this study, we identified the second case by panel sequencing. Because of its rarity, the diagnosis of CGM remains challenging for pediatricians.23 Fortunately, the classical symptoms of early-onset watery diarrhea and dehydration, as well as prompt sequencing, aid in early recognition. The patient had immediate cessation of diarrhea after receiving a low-carbohydrate diet.

In the watery diarrhea group, we also identified two patients with novel homozygous DGAT1 pathogenic variants by exome sequencing. However, these two patients died before we applied precision medicine based on genetic diagnosis. These patients have been described in detail elsewhere.24

Patient 53 was identified with a de novo CARD11 pathogenic variant by exome sequencing. He had a history of asthma, rhinitis, and eczema. He was referred to our institution because of recurrent abdominal pain and oral ulcers. Lower endoscopic analysis showed multiple colonic ulcers. Recently, a large cohort of patients with hypomorphic CARD11 pathogenic variants was described by Dorjbal et al.25 In addition to atopic diseases, IPEX-like diseases, ulcerative colitis, and Crohn disease have been reported in patients with CARD11 variants.25 However, there are currently no established curative therapies.25 Only in vitro studies have indicated that exogenous glutamine can partially restore defects in CARD11mut T-cell responses.26

Patients with fatty stool

In the fatty diarrhea group, the diagnostic rate was 5/7 (71.4%). We have identified pathogenic variants in the SBDS and UBR1 genes. Recently, variants in other genes, including DNAJC21, EFL1, and SRP54, have also been reported in patients with congenital fatty diarrhea due to exocrine pancreatic insufficiency.27,28,29 No pathogenic variant in these aforementioned genes was identified in the two patients with negative sequencing results.

Only seven patients were included in the fatty diarrhea group; the rarity of these entities might be one of the contributing factors. All patients had pancreatic insufficiency, as they responded to pancreatic enzyme replacement therapy.30 Patients with chylomicron retention disease and abetalipoproteinemia due to intestinal fat malabsorption also present with chronic diarrhea and steatorrhea.31 Hypocholesterolemia and decreased total cholesterol are pathognomonic in the lipid profile.32 We did not identify patients with these conditions in our cohort. Some patients remained undiagnosed or misdiagnosed because of the significantly variable clinical phenotype and nonspecific symptoms of these conditions.33,34 Congenital fatty diarrhea is still a challenge for pediatricians, and next-generation sequencing can lead to the identification of molecular aberrations and early treatment.35

Patients with bloody stool

Some CODE patients with bloody stools have very early-onset inflammatory bowel disease. Together with our previous studies, we hereby describe the largest cohort of patients with IL10RA pathogenic variants (n = 56).9 Twenty-seven had the synonymous T179T variant, which causes exon skipping and out-of-frame fusion of exons 3 and 5.36 This variant impairs RNA splicing and leads to altered STAT3 phosphorylation.36

Of these patients, one was identified with compound heterozygous variants in the DUOX2 gene by exome sequencing. She harbored two missense variants, one of which was reported to be disease causing. Biallelic DUOX2 pathogenic variants were recently identified as one cause of very early-onset inflammatory bowel disease.37 This patient was also diagnosed with congenital hypothyroidism; DUOX2 deficiency is also one of the most common genetic defects in Chinese congenital hypothyroidism patients.38 Until now, there has been no pathway-specific intervention.37 This patient suffered from intestinal perforation and underwent ileostomy. Her condition improved after parenteral nutrition.

However, we did not identify novel causative genes among our CODE patients. Some recently reported genes were not included in the targeted panels; an increasing number of recent studies have shown that pathogenic variants in the TGFB1 and WNT2B genes can cause CODE.39,40 Moreover, although the cohort was large, this was a single-center study. Because of the retrospective study setting, the sequencing method was chosen individually for each patient according to the practice at the time. A larger prospective study is required to identify novel pathogenic variants in CODE patients.

In conclusion, the majority of CODE patients had monogenic diseases. We suggest that next-generation sequencing be applied early in the disease course for these patients. Prompt pathway-specific therapies can improve the prognoses of CODE patients.