Noncoding deletions reveal a gene that is critical for intestinal function



Large-scale genome sequencing is poised to provide a substantial increase in the rate of discovery of disease-associated mutations, but the functional interpretation of such mutations remains challenging. Here we show that deletions of a sequence on human chromosome 16 that we term the intestine-critical region (ICR) cause intractable congenital diarrhoea in infants (refs. 1, 2). Reporter assays in transgenic mice show that the ICR contains a regulatory sequence that activates transcription during the development of the gastrointestinal system. Targeted deletion of the ICR in mice caused symptoms that recapitulated the human condition. Transcriptome analysis revealed that an unannotated open reading frame (Percc1) flanks the regulatory sequence, and the expression of this gene was lost in the developing gut of mice that lacked the ICR. Percc1-knockout mice displayed phenotypes similar to those observed upon ICR deletion in mice and patients, whereas an ICR-driven Percc1 transgene was sufficient to rescue the phenotypes found in mice that lacked the ICR. Together, our results identify a gene that is critical for intestinal function and underscore the need for targeted in vivo studies to interpret the growing number of clinical genetic findings that do not affect known protein-coding genes.

Fig. 1: Overview of the human and mouse locus and key findings.
Fig. 2: Enhancer activity of the ICR, and phenotypes of chr17ΔICR/ΔICR mice.
Fig. 3: Discovery of a gene, Percc1, that flanks the ICR.
Fig. 4: PERCC1 is abundant in G cells and its genetic disruption impairs the expression of gastrointestinal peptide hormones and the development of EECs.

Data availability

All RNA-seq data used in this study have been deposited in the GEO repository (National Center for Biotechnology Information). The files are accessible through the GEO accession number GSE94245. The cDNA and predicted protein sequence for Percc1 are available in GenBank (record KY964488). All other relevant data are available from the corresponding authors on request.


References

  1. Avery, G. B., Villavicencio, O., Lilly, J. R. & Randolph, J. G. Intractable diarrhea in early infancy. Pediatrics 41, 712–722 (1968).

  2. Straussberg, R. et al. Congenital intractable diarrhea of infancy in Iraqi Jews. Clin. Genet. 51, 98–101 (1997).

  3. Bamshad, M. J. et al. Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12, 745–755 (2011).

  4. Canani, R. B. & Terrin, G. Recent progress in congenital diarrheal disorders. Curr. Gastroenterol. Rep. 13, 257–264 (2011).

  5. Qu, H. & Fang, X. A brief review on the Human Encyclopedia of DNA Elements (ENCODE) project. Genomics Proteomics Bioinformatics 11, 135–141 (2013).

  6. Calo, E. & Wysocka, J. Modification of enhancer chromatin: what, how, and why? Mol. Cell 49, 825–837 (2013).

  7. Eeckhoute, J. et al. Cell-type selective chromatin remodeling defines the active subset of FOXA1-bound enhancers. Genome Res. 19, 372–380 (2009).

  8. Pennacchio, L. A. et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature 444, 499–502 (2006).

  9. Dimaline, R. & Varro, A. Novel roles of gastrin. J. Physiol. 592, 2951–2958 (2014).

  10. Barker, N. et al. Lgr5+ve stem cells drive self-renewal in the stomach and build long-lived gastric units in vitro. Cell Stem Cell 6, 25–36 (2010).

  11. Haber, A. L. et al. A single-cell survey of the small intestinal epithelium. Nature 551, 333–339 (2017).

  12. Helander, H. F. & Fändriks, L. The enteroendocrine “letter cells” – time for a new nomenclature? Scand. J. Gastroenterol. 47, 3–12 (2012).

  13. Spence, J. R. et al. Directed differentiation of human pluripotent stem cells into intestinal tissue in vitro. Nature 470, 105–109 (2011).

  14. Mellitzer, G. et al. Loss of enteroendocrine cells in mice alters lipid absorption and glucose homeostasis and impairs postnatal survival. J. Clin. Invest. 120, 1708–1721 (2010).

  15. Thiagarajah, J. R. et al. Advances in evaluation of chronic diarrhea in infants. Gastroenterology 154, 2045–2059 (2018).

  16. Osterwalder, M. et al. Enhancer redundancy provides phenotypic robustness in mammalian development. Nature 554, 239–243 (2018).

  17. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  18. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  19. Ge, D. et al. SVA: software for annotating and visualizing sequenced human genomes. Bioinformatics 27, 1998–2000 (2011).

  20. Zhu, M. et al. Using ERDS to infer copy-number variants in high-coverage genomes. Am. J. Hum. Genet. 91, 408–421 (2012).

  21. Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protocols 7, 562–578 (2012).

  22. Bockenhauer, D. et al. Epilepsy, ataxia, sensorineural deafness, tubulopathy, and KCNJ10 mutations. N. Engl. J. Med. 360, 1960–1970 (2009).

  23. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  24. Huang, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protocols 4, 44–57 (2009).

  25. Lindemann, S. R. et al. The epsomitic phototrophic microbial mat of Hot Lake, Washington: community structural responses to seasonal cycling. Front. Microbiol. 4, 323 (2013).

  26. Wang, H. et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910–918 (2013).

  27. Yang, H. et al. One-step generation of mice carrying reporter and conditional alleles by CRISPR/Cas-mediated genome engineering. Cell 154, 1370–1379 (2013).

  28. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823–826 (2013).

  29. Kvon, E. Z. et al. Progressive loss of function in a limb enhancer during snake evolution. Cell 167, 633–642 (2016).

  30. Barrett, T. et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 41, D991–D995 (2013).

  31. Warlich, E. et al. Lentiviral vector design and imaging approaches to visualize the early stages of cellular reprogramming. Mol. Ther. 19, 782–789 (2011).

  32. McCracken, K. W., Howell, J. C., Wells, J. M. & Spence, J. R. Generating human intestinal tissue from pluripotent stem cells in vitro. Nat. Protocols 6, 1920–1928 (2011).

  33. Glusman, G., Caballero, J., Mauldin, D. E., Hood, L. & Roach, J. C. Kaviar: an accessible system for testing SNV novelty. Bioinformatics 27, 3216–3217 (2011).


The authors thank the patients and their families for their cooperation and support. This work was supported by grants to D.L. from the SysKid EU FP7 project (241544), the Wolfson Family Charitable Trust and the Crown Human Genome Center at the Weizmann Institute of Science. A.V. and L.A.P. were supported by NHLBI grant R24HL123879 and NHGRI grants R01HG003988, U54HG006997 and UM1HG009421; J.M.W. and Y.A. were supported by the Cincinnati Children’s Hospital and Sheba Medical Center’s Joint Research Fund; J.M.W. and M.F.K. were supported by NIH grants 1R01DK092456 and 1U18NS080815 as well as a digestive disease center grant (P30 DK0789392); R.K. was supported by the David and Elaine Potter Charitable Foundation; B.L.B. was supported by NIH grants HL089707, HL064658 and HL136182; and I. Barozzi was funded through an Imperial College Research Fellowship. Research was conducted at the E. O. Lawrence Berkeley National Laboratory and performed under the Department of Energy contract DE-AC02-05CH11231 (University of California). iPS cell lines were generated in collaboration with the Cincinnati Children’s Pluripotent Stem Cell Facility. This work was performed in partial fulfilment of the requirements for a PhD degree for D.O.-L. (Weizmann Institute of Science, Rehovot, Israel) and I.B.-J. (The Sackler Faculty of Medicine, Tel Aviv University, Israel).

Author information


C.H., R. Shamir, R. Shapiro, B.W., B.P.-S., P.T., I. Barshack, E.P. and Y.A. recruited patients, provided patient care and characterized the symptoms. B.W., B.P.-S. and Y.A. obtained biopsies. D.O.-L., D.B.G., E.K.R. and Y.H. managed the sequencing, D.O.-L., T.O., I. Barozzi and A.A. performed the bioinformatic analyses and discovered the mutation. M.T., H.C.S. and R.K. provided SNP genotyping and linkage analysis. I.B.-J., D.M.-Y., H.R.-W., M.O., E.S.M.V., R.M., M.S., I. Barshack and W.d.L. did experimental work, including mutation characterization. Y.Z., M.O., A.S.N., V.A., D.M.I., D.C.-D., D.E.D., K.L.v.B., R.M.B., B.L.B., A.V. and L.A.P. did the transgenic and knockout mouse generation and characterization. J.M.W., M.F.K., A.P. and C.N.M. did the iPS cell generation and human intestinal organoid studies. D.O.-L., A.V., L.A.P. and D.L. wrote the manuscript. R. Shamir, D.B.G., E.P., L.A.P., D.L. and Y.A. provided leadership to the project. All authors contributed to the final manuscript.

Corresponding authors

Correspondence to Doron Lancet, Yair Anikster or Len A. Pennacchio.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Family pedigrees.

Filled black symbols represent affected individuals, and deletion genotypes are indicated in red. WES was done for individuals 1.1, 2.1, 3.1, 4.1 and 4.2, WGS was done for individual 2.1 and transcriptome analysis was done for individuals 2.1 and 2.4. Patient 1.1 (marked with an asterisk) was found to have uniparental disomy.

Extended Data Fig. 2 Genetic analysis in IDIS.

a, Analysis of the SNP genotyping that was performed on 6 of the patients in families 1–5 and their 22 relatives detected a single significant telomeric linkage interval on chr16 with a maximum LOD score of 4.26. Haplotype reconstruction confirmed this interval, with flanking marker rs207435 (chr16: 2,984,868), and showed two distinct disease haplotypes either in a homozygous setting (in affected individuals for disease allele 1 (that is, ΔL) in families 2, 3, 5) or in a compound heterozygous setting (in affected individuals for disease alleles 1 and 2 (that is, ΔS) in family 4). All the affected individuals who carried disease allele 1 showed an identical disease haplotype from rs533184 (chr16: 1,155,025) to rs397435 (chr16: 2,010,138). b, Schematic of reads covering exons in the C16orf91 gene, for the five exome-sequenced patients and for three unaffected controls who underwent sequencing under identical conditions. The first three patients (individuals 1.1, 2.1 and 3.1), who had a chr16ΔL/ΔL genotype, had zero coverage in the three upstream exons (right). The last two patients (individuals 4.1 and 4.2), who had a chr16ΔL/ΔS genotype, had non-zero coverage in these exons, but coverage was lower than in controls. All subjects had high coverage in the downstream exons (left). Numbers indicate the scale in sequencing reads per base.
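The coverage pattern described in panel b (zero reads in the upstream exons for chr16ΔL/ΔL patients, reduced but non-zero reads for chr16ΔL/ΔS patients) can be illustrated with a minimal sketch that compares a patient's per-exon read depth with the mean depth of controls. The function name, the labels and the two ratio thresholds are hypothetical choices made for illustration only; they are not taken from the study, which used dedicated tools for variant and copy-number calling.

```python
from statistics import mean


def classify_exon_coverage(patient_depth, control_depths, low=0.1, high=0.75):
    """Label one exon by the ratio of patient depth to mean control depth.

    `low` and `high` are illustrative cut-offs (assumptions), not values
    reported in the paper.
    """
    baseline = mean(control_depths)
    if baseline <= 0:
        raise ValueError("controls have no coverage; exon cannot be assessed")
    ratio = patient_depth / baseline
    if ratio < low:
        return "absent"    # pattern seen when both alleles lack the exon
    if ratio < high:
        return "reduced"   # pattern seen when one allele retains the exon
    return "normal"        # coverage comparable to controls
```

Depth ratios of this kind are only meaningful when patients and controls are sequenced under identical conditions, as was the case for the three controls described in the legend.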

Extended Data Fig. 3 Targeted deletion of the ICR noncoding sequence in mice.

a, Overview of targeting approach. See Methods for details. b, Genotyping results obtained from genomic DNA (n = 554) isolated from the tails of homozygous and heterozygous ICR-knockout (ΔICR) mice, compared to a wild-type control. See Methods for primers and details. c, Percc1 expression derived from RNA-seq from control littermates (left) and knockout mice (right). Tissues and time points are indicated to the left of each plot.

Extended Data Fig. 4 Gastrointestinal and microbiome analysis in chr17ΔICR/ΔICR mice.

a, Modified intestinal content in wild-type mice (left) compared to chr17ΔICR/ΔICR mice (right; n = 45) at P10. b–d, ICR deletion causes changes in intestinal and faecal microbiome composition. Microbial communities in different intestinal compartments and faeces were analysed by 16S rRNA-based sequence profiling. b, Family-level relative abundance profiles of the top 15 most abundant prokaryotic families for wild-type (n = 22) and chr17ΔICR/ΔICR (n = 21) intestinal and faecal samples, organized by sample type. The most pronounced changes were observed in colon and faecal samples. c, Heat map of log-transformed read counts for those genera that exhibited the greatest variance (top 60%) across all faecal samples. The abundance profiles exhibit perfect clustering of the faecal samples (rows) into wild-type (n = 6) and chr17ΔICR/ΔICR (n = 7) groups. d, Bar charts of Shannon’s diversity for all faecal samples from b, grouped into wild-type and chr17ΔICR/ΔICR samples.
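For readers unfamiliar with the metric in panel d, Shannon’s diversity for one sample is H = −Σ p_i ln p_i, where p_i is the fraction of reads assigned to taxon i. The sketch below is a minimal illustration; the function name is hypothetical and the use of the natural logarithm is an assumption, because the legend does not state the base that was used.

```python
import math


def shannon_diversity(counts):
    """Shannon's diversity H = -sum(p_i * ln p_i) from per-taxon read counts.

    Natural log is assumed here; taxa with zero reads contribute nothing.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

A sample dominated by a single taxon gives H close to 0, whereas k equally abundant taxa give H = ln k.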

Extended Data Fig. 5 Gastrointestinal X-gal staining of ICR-reporter transgenic embryos compared to Percc1 mRNA in situ hybridization.

a, b, Cross-sections of E14.5 mouse tissues with a β-galactosidase ICR-driven transgene. c, d, Percc1 mRNA in situ hybridization analysis on E14.5 wild-type sections. For X-gal staining and in situ hybridization experiments, two embryos for each experiment and each condition were collected at E14.5 and a minimum of three sections from each embryo were examined. Representative sections are shown.

Extended Data Fig. 6 Analysis of body weight in Percc1-knockout and transgenic mice.

a, Comparison of weight in Percc1-knockout mice (n = 38; red) and littermate controls (n = 25; blue), showing that Percc1-knockout mice have reduced body weight. Percc1-knockout mice were generated in an FVB/N genetic background. b, Percc1 transgenic rescue of the body weight phenotype that is found in chr17ΔICR/ΔICR mice. An 8.5-kb Percc1 mini gene was constructed (Supplementary Table 10) and used to generate a Percc1 mouse line that overexpressed PERCC1. When this transgene was introduced into the chr17ΔICR/ΔICR mouse genetic background, we observed the rescue of all the phenotypes (including the severe reduction in body weight) that were found in chr17ΔICR/ΔICR mice. Chr17ΔICR/ΔICR mice were generated in a mixed 129/C57Bl6 background. P values were determined using a two-tailed t-test; n.s. indicates a P value of 0.8–1.0. Lines show the mean and shaded areas represent ±1 s.d.

Extended Data Fig. 7 Characterization of PERCC1 in mice and patients.

a, Western blot analysis of PERCC1–mCherry fusion protein. Two stable transgenic lines (B3269 and B3309) were established through standard pronuclear microinjection of fertilized mouse eggs. Protein extracts from juvenile mice (P13–P14) were separated by SDS–PAGE and transferred for western hybridization. Lanes: 1, molecular mass marker (M); 2 and 3, line B3269; 4 and 5, line B3309; 6, wild-type control; 7, mCherry positive control; 8, molecular mass marker. mCherry is predicted to be 28.8 kDa and the PERCC1–mCherry fusion protein is predicted to be 59 kDa, with both proteins running about 5 kDa larger. Line B3309 does not express the fusion protein, in contrast to line B3269 (probably owing to a position effect). These experiments were performed four times. b–e, Identification of cells with PERCC1+ identity, and the effect of PERCC1 ablation in gastrointestinal tissues. b, A subpopulation of PERCC1+ cells (red) in the corpus epithelium (mucosa) expresses SYP (green) at P8. Arrowheads mark double-positive cells. Arrows mark a minor fraction of PERCC1+ cells that were detected in longitudinal smooth muscle (lSM). DAPI-stained nuclei are shown in blue. c, Dispersed PERCC1+ cells (red) are observed in the villi of the duodenum at P8. Top, cross-section through villi illustrates the absence of endocrine identity (green) in these cells. Bottom, sagittal section showing the distribution of PERCC1+ cells in the epithelium of villi (CDH1; green). d, Top left, schematic depicting the anatomical compartments of the distal stomach and the location of sections used for cell counting. Top right, reduction of the fraction of G cells observed predominantly in the pyloric antrum of Percc1-deficient (chr17ΔICR/ΔICR) mice at P8. Box plots indicate median (centre line), interquartile values (box limits), range (whiskers), outliers (circled dots) and individual biological replicates (dots). P values were determined using an unpaired two-tailed t-test. Bottom, comparative immunofluorescence analysis illustrating the reduced number of gastrin-expressing cells (red) in the absence of Percc1 (chr17ΔICR/ΔICR) in the antrum at P8. SYP-expressing endocrine cells are green and nuclei are grey. e, Immunofluorescence from HIOs derived from control (ICR+/+) and patient (ICRΔL/ΔL) iPS cell lines. Detection of anti-FOXA2 (blue) and anti-CDH1 (red) was used to visualize the HIO epithelium, and EECs were localized at 21 and 42 days on the basis of SYP expression (green) and counted. The average number of SYP+ cells (NSYP+) per 1,000 epithelial (Epi) cells from cell counts in n = 2 biological replicates from independent HIO preparations is indicated (P = 1.75 × 10^−18 for reduced number of SYP+ cells in ICRΔL/ΔL HIOs; Fisher’s exact test). n represents independent biological replicates with similar results. Scale bars, 50 μm.
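The comparison of SYP+ cell counts in panel e relies on Fisher’s exact test applied to a 2 × 2 table (SYP+ versus SYP− cells, control versus patient HIOs). The following is a minimal pure-Python sketch of a two-sided version of the test, summing the probabilities of all tables that are no more likely than the observed one. The function name and the example counts in the comment are hypothetical, and the legend does not state whether a one-sided or two-sided test was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums hypergeometric probabilities of every table with the same margins
    whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, total)


# Hypothetical usage: 12 SYP+ of 1,000 control cells vs 1 SYP+ of 1,000 patient cells.
# p = fisher_exact_two_sided(12, 988, 1, 999)
```

Exact integer binomial coefficients keep the calculation stable even for tables with thousands of cells, at the cost of speed compared with a library implementation.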

Extended Data Fig. 8 Percc1 analysis of single-cell transcriptomes from mouse intestine.

a, Left, bar chart showing the fraction of the total cells profiled in a previous study (ref. 11; n = 11,665) that was assigned to each one of the major cell types identified. Right, the same information, but limited to those cells that express Percc1 (n = 8). b, Same as a but limited to EECs. P values were calculated using a chi-squared test, using data from the corresponding left panel as reference (a, b). EP, epithelial; TA, transit-amplifying. c, Box plots showing the distributions of the normalized gene-expression values for known EEC-associated transcription factors and hormones in the eight Percc1-positive cells from a. Box plots indicate median (centre line), interquartile range (IQR; box limits) and 1.5 × IQR (whiskers).
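The chi-squared comparison in panels a and b asks whether the cell-type composition of the Percc1-expressing cells departs from the composition of the reference population. A minimal sketch of the goodness-of-fit statistic is given below; the function name is hypothetical, the conversion of the statistic to a P value is deliberately left out, and the exact procedure used in the study may differ.

```python
def chi_squared_statistic(observed, reference):
    """Goodness-of-fit statistic for observed counts against reference counts.

    Expected counts are the reference proportions scaled to the observed
    total. Returns (statistic, degrees_of_freedom).
    """
    if len(observed) != len(reference):
        raise ValueError("observed and reference need one entry per cell type")
    n_obs, n_ref = sum(observed), sum(reference)
    statistic = 0.0
    for obs, ref in zip(observed, reference):
        expected = n_obs * ref / n_ref
        if expected <= 0:
            raise ValueError("every cell type needs a non-zero reference count")
        statistic += (obs - expected) ** 2 / expected
    return statistic, len(observed) - 1
```

With only eight Percc1-positive cells the expected counts per category are small, so the statistic should be read with caution.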

Extended Data Fig. 9 Validation of human RNA-seq data by RT–qPCR in duodenal tissue from two different patients and control tissue.

Pairwise comparison of the relative gene-expression levels of six peptide hormones (cholecystokinin (CCK), gastrin (GAST), glucagon (GCG), gastric inhibitory polypeptide (GIP), neurotensin (NTS) and somatostatin (SST)) in duodenal tissue from patients and normal duodenal tissue (control; represented as 1). Relative expression levels for patients represent the average between two patients (patients 1.1 and 5.1).

Extended Data Fig. 10 Characterization of HIOs and iPS cell lines.

a, HIOs generated from an affected patient, a carrier and an unaffected sibling all show normal morphology. Differentiation into HIOs was performed in duplicate with qPCR and histological analyses that yielded similar results. b, iPS cell lines from an affected patient, a carrier and an unaffected sibling display a normal karyotype. This was a single experiment for each sample, as an assessment of quality control.

Supplementary information

Supplementary Information

This file contains additional information on IDIS patients including genetic and transcriptome results and Supplementary Tables 5-11.

Reporting Summary

Supplementary Table 1

Differentially expressed genes (DEGs) in stomach and intestine of P10 mice and human patients. For each species and organ under consideration, a list of up- and down-regulated genes is provided. For each DEG, the gene symbol along with the RPKM (Reads Per Kilobase per Million mapped reads) in KO and WT individuals, the log2-fold-change, the p-value (estimated using Cuffdiff, n = 1 biological replicates) and the q-value (Benjamini-Hochberg) are reported.
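The quantities reported in this table can be sketched as follows: RPKM normalizes a read count by gene length (in kilobases) and library size (in millions of mapped reads), the log2 fold change compares knockout with wild type, and the Benjamini-Hochberg procedure converts P values into q values. The function names and the pseudocount are assumptions made for illustration; the study obtained these values from Cuffdiff, which applies its own statistical model.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(ko, wt, pseudocount=1.0):
    """log2(KO / WT); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((ko + pseudocount) / (wt + pseudocount))


def benjamini_hochberg(p_values):
    """Return q values in the same order as the input P values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

For example, 500 reads on a 2-kb gene in a library of 10 million mapped reads correspond to an RPKM of 25.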

Supplementary Table 2

Intersection of DEGs in the mouse with the corresponding murine orthologs of the human DEGs. For each organ and separately for up- and down-regulated genes, those DEGs common to both species are listed along with those unique for each species. p-values indicating the probability of observing an equal or better overlap by chance are provided (hypergeometric test).
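The overlap P value described here is the upper tail of a hypergeometric distribution: given a universe of N genes, a mouse DEG list of size K and a human-derived list of size n, it is the probability of sharing at least k genes by chance. A minimal sketch follows; the function and parameter names are hypothetical, and the choice of gene universe (for example, all genes with one-to-one orthologues) is an assumption that the table description does not specify.

```python
from math import comb


def overlap_p_value(universe, list_a, list_b, overlap):
    """P(X >= overlap) for the overlap of two gene lists drawn from `universe`.

    X follows a hypergeometric distribution with population size `universe`,
    `list_a` marked genes and `list_b` draws.
    """
    upper = min(list_a, list_b)
    total = comb(universe, list_b)
    tail = sum(
        comb(list_a, k) * comb(universe - list_a, list_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

Because the sum uses exact integer arithmetic before the final division, the result remains accurate for genome-scale universes.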

Supplementary Table 3

Functional enrichment analysis of DEGs. For each species and separately for up- and down-regulated genes, functional enrichment was run using DAVID. The results of Functional Annotation Clustering (default parameters) are provided. The most representative GO and KEGG terms are highlighted in red. These terms were used for the prioritization strategy reported in Table 1. P-value estimated using the modified Fisher’s exact test (EASE Score).

Supplementary Table 4

Clinical characteristics of congenital diarrhea patients with deleted enhancer.

About this article

Cite this article

Oz-Levi, D., Olender, T., Bar-Joseph, I. et al. Noncoding deletions reveal a gene that is critical for intestinal function. Nature 571, 107–111 (2019).
