The translation of non-canonical open reading frames controls mucosal immunity


The annotation of the mammalian protein-coding genome is incomplete. Arbitrary size restriction of open reading frames (ORFs) and the absolute requirement for a methionine codon as the sole initiator of translation have constrained the identification of potentially important transcripts with non-canonical protein-coding potential [1,2]. Here, using unbiased transcriptomic approaches in macrophages that respond to bacterial infection, we show that ribosomes associate with a large number of RNAs that were previously annotated as ‘non-protein coding’. Although the idea that such non-canonical ORFs can encode functional proteins is controversial [3,4], we identify a range of short and non-ATG-initiated ORFs that can generate stable and spatially distinct proteins. Notably, we show that the translation of a new ORF ‘hidden’ within the long non-coding RNA Aw112010 is essential for the orchestration of mucosal immunity during both bacterial infection and colitis. This work expands our interpretation of the protein-coding genome and demonstrates that proteinaceous products generated from non-canonical ORFs are crucial for the immune response in vivo. We therefore propose that the misannotation of non-canonical ORF-containing genes as non-coding RNAs may obscure the essential role of a multitude of previously undiscovered protein-coding genes in immunity and disease.
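To make the idea of "short and non-ATG-initiated ORFs" concrete, the sketch below scans a nucleotide sequence for open reading frames while allowing near-cognate start codons. It is a deliberately naive illustration, not the pipeline used in the study: the function name `find_orfs`, the `min_codons` threshold and the choice to accept all nine single-nucleotide variants of ATG as starts are assumptions made for this example.

```python
# Illustrative only: a naive forward-strand ORF scan, not the study's method.
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumption: allow ATG plus the nine codons that differ from ATG at one position.
NEAR_COGNATE = {"CTG", "GTG", "TTG", "AAG", "ACG", "AGG", "ATA", "ATC", "ATT"}
START_CODONS = {"ATG"} | NEAR_COGNATE


def find_orfs(seq, min_codons=10, starts=START_CODONS):
    """Return sorted (start, end, frame, start_codon) tuples for forward-strand ORFs.

    start and end are 0-based and end-exclusive; end includes the stop codon.
    min_codons counts sense codons (start codon included, stop codon excluded).
    Only the most upstream allowed start codon before each stop is reported.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                if orf_start is not None:
                    n_codons = (i - orf_start) // 3
                    if n_codons >= min_codons:
                        orfs.append((orf_start, i + 3, frame, seq[orf_start:orf_start + 3]))
                orf_start = None  # reset and look for the next ORF in this frame
            elif orf_start is None and codon in starts:
                orf_start = i
    return sorted(orfs)
```

Restricting `starts` to `{"ATG"}` and raising `min_codons` mimics the kind of annotation filter the abstract argues against: a CTG-initiated short ORF that the permissive scan reports disappears under the stricter settings.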

Fig. 1: Bacterial infection drives widespread ribosomal association with non-coding RNAs.
Fig. 2: LPS triggers genome wide differential translation of non-canonical ORFs in lncRNAs.
Fig. 3: Translation of the non-canonical Aw112010 encoded ORF is essential for mucosal immunity.
Fig. 4: Translation of the Aw112010 non-canonical ORF encoded protein is required for IL-12 production.

Data availability

RNA-seq, RiboTag RNA-seq, and ribosome profiling data that support the findings of this study have been deposited in the Gene Expression Omnibus (GEO) repository with the accession code GSE120762. All lncRNA RNA-seq, RiboTagSeq, ribosome profiling sequencing and analysis can also be found in Supplementary Table 1.


  1. Couso, J. P. & Patraquim, P. Classification and function of small open reading frames. Nat. Rev. Mol. Cell Biol. 18, 575–589 (2017).

  2. Kozak, M. Regulation of translation in eukaryotic systems. Annu. Rev. Cell Biol. 8, 197–225 (1992).

  3. Guttman, M., Russell, P., Ingolia, N. T., Weissman, J. S. & Lander, E. S. Ribosome profiling provides evidence that large noncoding RNAs do not encode proteins. Cell 154, 240–251 (2013).

  4. Ingolia, N. T., Lareau, L. F. & Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789–802 (2011).

  5. Hoagland, M. B., Stephenson, M. L., Scott, J. F., Hecht, L. I. & Zamecnik, P. C. A soluble ribonucleic acid intermediate in protein synthesis. J. Biol. Chem. 231, 241–257 (1958).

  6. Sanz, E. et al. Cell-type-specific isolation of ribosome-associated mRNA from complex tissues. Proc. Natl Acad. Sci. USA 106, 13939–13944 (2009).

  7. Clausen, B. E., Burkhardt, C., Reith, W., Renkawitz, R. & Förster, I. Conditional gene targeting in macrophages and granulocytes using LysMcre mice. Transgenic Res. 8, 265–277 (1999).

  8. Mudge, J. M. & Harrow, J. Creating reference gene annotation for the mouse C57BL6/J genome assembly. Mamm. Genome 26, 366–378 (2015).

  9. Pruitt, K. D., Tatusova, T., Brown, G. R. & Maglott, D. R. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res. 40, D130–D135 (2012).

  10. Kozomara, A. & Griffiths-Jones, S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 42, D68–D73 (2014).

  11. Carpenter, S. et al. A long noncoding RNA mediates both activation and repression of immune response genes. Science 341, 789–792 (2013).

  12. Kotzin, J. J. et al. The long non-coding RNA Morrbid regulates Bim and short-lived myeloid cell lifespan. Nature 537, 239–243 (2016).

  13. Osuna, B. A., Howard, C. J., Kc, S., Frost, A. & Weinberg, D. E. In vitro analysis of RQC activities provides insights into the mechanism and function of CAT tailing. eLife 6, e27949 (2017).

  14. Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009).

  15. Ji, Z., Song, R., Regev, A. & Struhl, K. Many lncRNAs, 5′UTRs, and pseudogenes are translated and some are likely to express functional proteins. eLife 4, e08890 (2015).

  16. Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).

  17. Venter, J. C. et al. The sequence of the human genome. Science 291, 1304–1351 (2001).

  18. Kondo, T. et al. Small peptides switch the transcriptional activity of Shavenbaby during Drosophila embryogenesis. Science 329, 336–339 (2010).

  19. Kearse, M. G. & Wilusz, J. E. Non-AUG translation: a new start for protein synthesis in eukaryotes. Genes Dev. 31, 1717–1731 (2017).

  20. Stothard, P. The sequence manipulation suite: JavaScript programs for analyzing and formatting protein and DNA sequences. Biotechniques 28, 1102, 1104 (2000).

  21. Wang, H., Wang, Y., Xie, S., Liu, Y. & Xie, Z. Global and cell-type specific properties of lincRNAs with ribosome occupancy. Nucleic Acids Res. 45, 2786–2796 (2017).

  22. Xiao, Z. et al. De novo annotation and characterization of the translatome with ribosome profiling data. Nucleic Acids Res. 46, e61–e61 (2018).

  23. D’Lima, N. G. et al. A human microprotein that interacts with the mRNA decapping complex. Nat. Chem. Biol. 13, 174–180 (2017).

  24. de Jong, R. et al. Severe mycobacterial and Salmonella infections in interleukin-12 receptor-deficient patients. Science 280, 1435–1438 (1998).

  25. Lehmann, J. et al. IL-12p40-dependent agonistic effects on the development of protective innate and adaptive immunity against Salmonella enteritidis. J. Immunol. 167, 5304–5315 (2001).

  26. Mannon, P. J. et al. Anti-interleukin-12 antibody for active Crohn’s disease. N. Engl. J. Med. 351, 2069–2079 (2004).

  27. Neurath, M. F., Fuss, I., Kelsall, B. L., Stüber, E. & Strober, W. Antibodies to interleukin 12 abrogate established experimental colitis in mice. J. Exp. Med. 182, 1281–1290 (1995).

  28. Pulak, R. & Anderson, P. mRNA surveillance by the Caenorhabditis elegans smg genes. Genes Dev. 7, 1885–1897 (1993).

  29. Martin, L. et al. Identification and characterization of small molecules that inhibit nonsense-mediated RNA decay and suppress nonsense p53 mutations. Cancer Res. 74, 3104–3113 (2014).

  30. Nowarski, R. et al. Epithelial IL-18 equilibrium controls barrier function in colitis. Cell 163, 1444–1456 (2015).

  31. Gabanyi, I. et al. Neuro-immune interactions drive tissue programming in intestinal macrophages. Cell 164, 378–391 (2016).

  32. Obrig, T. G., Culp, W. J., McKeehan, W. L. & Hardesty, B. The mechanism by which cycloheximide and related glutarimide antibiotics inhibit peptide synthesis on reticulocyte ribosomes. J. Biol. Chem. 246, 174–181 (1971).

  33. Schneider-Poetsch, T. et al. Inhibition of eukaryotic translation elongation by cycloheximide and lactimidomycin. Nat. Chem. Biol. 6, 209–217 (2010).

  34. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).

  35. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  36. Zhang, H., Meltzer, P. & Davis, S. RCircos: an R package for Circos 2D track plots. BMC Bioinformatics 14, 244 (2013).

  37. Phanstiel, D. H., Boyle, A. P., Araya, C. L. & Snyder, M. P. Sushi.R: flexible, quantitative and integrative genomic visualizations for publication-quality multi-panel figures. Bioinformatics 30, 2808–2810 (2014).

  38. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  39. Frolova, L. et al. A highly conserved eukaryotic protein family possessing properties of polypeptide chain release factor. Nature 372, 701–703 (1994).

  40. Pertea, M. et al. Thousands of large-scale RNA sequencing experiments yield a comprehensive new human gene list and reveal extensive transcriptional noise. Preprint at (2018).

  41. Wessel, D. & Flügge, U. I. A method for the quantitative recovery of protein in dilute solution in the presence of detergents and lipids. Anal. Biochem. 138, 141–143 (1984).

  42. Slavoff, S. A. et al. Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nat. Chem. Biol. 9, 59–64 (2013).

  43. Gundry, R. L. et al. Preparation of proteins and peptides for mass spectrometry analysis in a bottom-up proteomics workflow. Curr. Protoc. Mol. Biol. Chapter 10, Unit10.25 (2009).

  44. Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl Acad. Sci. USA 100, 6940–6945 (2003).

  45. Chen, L. M., Kaniga, K. & Galán, J. E. Salmonella spp. are cytotoxic for cultured macrophages. Mol. Microbiol. 21, 1101–1115 (1996).

  46. Gruber, A. R., Lorenz, R., Bernhart, S. H., Neuböck, R. & Hofacker, I. L. The Vienna RNA websuite. Nucleic Acids Res. 36, W70–W74 (2008).

  47. Xu, D. & Zhang, Y. Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field. Proteins 80, 1715–1735 (2012).


We thank J. Alderman, C. Lieber, C. Hughes, L. Evangelisti, E. Hughes-Picard, E. Ryke, L. Machado and C. Castaldi for help in facilitating this work. We thank J. Galan and H. Sun for providing the S. Typhimurium and for discussion on the infection model. We thank R. Nowarski, M. Healy, K. Baker and N. Palm for comments and discussion on the manuscript. This work was supported by the Howard Hughes Medical Institute and the Blavatnik Family Foundation (R.A.F.). This work was supported in part by the Searle Scholars Program, the Leukemia Research Foundation, an American Cancer Society Institutional Research Grant Individual Award for New Investigators (IRG-58-012-57), the NIH (R01GM122984), and Yale University West Campus start-up funds (to S.S.). A.K. was in part supported by an NIH Predoctoral Training Grant (5T32GM067543-12) (S.S.).

Reviewer information

Nature thanks M. Raffatellu and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Authors and Affiliations



R.J. conceived the project, performed experiments, analysed the data and wrote the manuscript. L.K. performed all bioinformatics analysis and aided in writing of the manuscript. A.K. performed all the mass spectrometry experiments, analysed data and aided in writing of the manuscript. W.B. performed experiments and contributed major conceptual insight into the work. A.J. and A.G.Y. participated in experimental design, conducted experiments, analysed data and offered vital conceptual insight. O.M.K., J.R.B., M.H.S. and C.D. performed experiments and analysed data. C.C.D.H. helped with bioinformatics analysis and provided conceptual discussion. L.C., P.B., A.G.S. and H.R.S. helped with experiments. S.S. supervised all mass spectrometry work and contributed to the overall interpretation of this work. R.A.F. supervised the project, helped interpret the work and supervised writing of the manuscript.

Corresponding author

Correspondence to Richard A. Flavell.

Ethics declarations

Competing interests

R.A.F. is a scientific advisor to GlaxoSmithKline. All other authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 RiboTag RNA isolation and mRNA expression.

a, Nanodrop analysis of ribosome-associated RNA isolated from RiboTag (Cre−) and RiboTagLysM (Cre+) mice showing no detected RNA isolated from Cre− BMDMs. b, Bioanalyzer (Agilent) traces and RNA integrity number (RIN) of ribosome-associated RNA isolated from BMDMs from RiboTagLysM mice non-treated or stimulated with LPS for 6 or 24 h. c, qPCR analysis of ribosome-associated transcripts from non-treated BMDMs, or BMDMs stimulated with LPS (10 ng ml−1) or infected with S. Typhimurium at an MOI of 1 for 6 h. Data are mean ± s.e.m. from six biological replicates.

Extended Data Fig. 2 Ribosome profiling, RNA-seq read tracing and RibORF analysis.

a–d, Wild-type BMDMs were non-treated or stimulated with LPS (10 ng ml−1) for 6 h and ribosome profiling was conducted. Data are representative of two biological replicates. a, Pattern of RNA-seq transcriptional reads (red) and ribosome profiling translational reads (blue) for Il12b from non-treated (top) and LPS-stimulated (bottom) BMDMs. The gene structure of Il12b is located in the centre, with a thin blue line representing the introns and wide blue rectangles indicating exonic structure. Thinner exonic structures represent annotated 5′ and 3′ UTRs. b, Pattern of RNA-seq transcriptional reads (red) and ribosome profiling translational reads (blue) for a non-RiboTag-identified lncRNA, A130088B15rik, from non-treated (top) and LPS-stimulated (bottom) BMDMs. The gene structure of A130088B15rik is located in the centre, with a thin blue line representing the introns and wide blue rectangles indicating exonic structure. c, Pattern of RNA-seq transcriptional reads (red) and ribosome profiling translational reads (blue) for a RiboTag-identified lncRNA, Aw112010, from non-treated (top) and LPS-stimulated (bottom) BMDMs. The gene structure of Aw112010 is located in the centre, with a thin blue line representing the introns and wide blue rectangles indicating exonic structure. d, RibORF analysis of read distribution (reads per million mappable reads; RPM) around start and stop codons of known, annotated protein-coding genes in steady state and LPS-stimulated samples.
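The RPM unit in panel d is a simple library-size normalization. A minimal sketch follows, assuming per-position read counts and a total number of mappable reads are already available; the function name `reads_per_million` and its arguments are hypothetical and are not taken from RibORF.

```python
def reads_per_million(counts, total_mapped_reads):
    """Scale raw read counts to reads per million mappable reads (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return [count * scale for count in counts]
```

Normalizing in this way lets read densities around start and stop codons be compared between the steady-state and LPS-stimulated libraries even when their sequencing depths differ.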

Extended Data Fig. 3 Breakdown of different analytical approaches to predict protein coding lncRNAs.

a, RiboCode analysis of ribosome-profiling data identifies 85 ORFs within lncRNAs with protein-coding potential. b, Comparison of non-canonical ORFs identified by RibORF, RRS and translation efficiency, and RiboCode analytical strategies from BMDMs expressing lncRNA using ribosome profiling. c, Wild-type BMDMs were non-treated or stimulated with LPS (10 ng ml−1) for 6 h and ribosome profiling was conducted. Data are representative of two biological replicates. Volcano plot of LPS-induced differentially regulated genes identified by RibORF, RiboCode, RRS and translation efficiency analysis.
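Translation efficiency is conventionally taken as the ratio of ribosome-footprint abundance to mRNA abundance for the same gene or ORF. The sketch below illustrates that ratio and its log2 change between two conditions; it assumes both inputs are already normalized to comparable units, and the function names and the optional `pseudocount` are choices made for this example rather than the authors' exact calculation.

```python
import math


def translation_efficiency(ribo_abundance, rna_abundance, pseudocount=0.0):
    """Ratio of ribosome-profiling abundance to RNA-seq abundance.

    Both inputs are assumed to be normalized in the same way (for example RPKM).
    """
    denominator = rna_abundance + pseudocount
    if denominator <= 0:
        raise ValueError("RNA abundance (plus pseudocount) must be positive")
    return (ribo_abundance + pseudocount) / denominator


def log2_te_change(te_treated, te_control):
    """log2 fold change in translation efficiency, treated versus control."""
    if te_treated <= 0 or te_control <= 0:
        raise ValueError("translation efficiencies must be positive")
    return math.log2(te_treated / te_control)
```

A positive `log2_te_change` for an ORF after LPS stimulation would indicate more ribosome occupancy per transcript, which is the kind of shift a volcano plot of differential translation displays on its horizontal axis.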

Extended Data Fig. 4 Overexpression of non-canonical ORFs reveals distinct subcellular localization.

a, HEK293 cells were transfected with 500 ng of Flag-tagged plasmids encoding the non-canonical ORFs GM7160 and GM9895. Cells were fixed and stained with DAPI (blue, nucleus), phalloidin (red, cytoskeletal F-actin) and anti-Flag (green, ORF of interest). Original magnifications, ×60 and ×100. Data are representative of three or more independent experiments.

Extended Data Fig. 5 Aw112010HA mouse characterization.

a, Schematic representation of Aw112010HA knock-in mice. b, Genotyping for Aw112010HA mice from CRISPR–Cas9 injections. c, Sequence information for GGSG(×3)–HA-tag insertion used to generate Aw112010HA mice. d, Wild-type and Aw112010HA BMDMs were left untreated or stimulated with LPS (10 ng ml−1) for 6 h. Protein lysates were generated and incubated overnight with anti-haemagglutinin magnetic beads. Purified lysates were probed for haemagglutinin by western blot. Whole-cell lysates were used as a loading control and probed for β-tubulin. Data are representative of three independent experiments.

Extended Data Fig. 6 Mass spectrometry validation of the Aw112010 ORF.

a, Tandem mass spectrometry (MS/MS) fragmentation of an endogenous peptide from Aw112010 found in LPS-stimulated macrophages after haemagglutinin immunoprecipitation purification. Identified fragment ions (b and y ions, red) are indicated above and below the peptide sequence. b, Aw112010 predicted protein sequence. Peptides detected by mass spectrometry are highlighted in red and blue. c, Mass spectrometry information for the identified fragments shown in Fig. 2h.

Extended Data Fig. 7 Characterization of Aw112010Stop mice.

a, Schematic representation of Aw112010HA knock-in (KI) mice. b, Genotyping for Aw112010Stop mice generated using CRISPR–Cas9. Het, heterozygous. c, Sanger sequencing of the frameshifting stop codon insertion in Aw112010Stop mice and wild-type controls. d, Sequence of frameshifting stop insertion. Stop codons and frame positions are indicated below the sequence.

Extended Data Fig. 8 Cytokine production in wild-type and Aw112010Stop macrophages and mice.

a–c, Wild-type and Aw112010Stop BMDMs were stimulated with LPS for indicated times and supernatants were analysed for IL-12p40, IL-6 and IL-10 by ELISA. Data are from six biological replicates conducted over two independent experiments. d, Mice were administered PBS (n = 5) or LPS (n = 6, WT and Aw112010Stop) (10 mg kg−1) for 6 h via intraperitoneal injection. Serum was analysed for IL-6 by ELISA. Error bars denote s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, unpaired two-tailed t-test.
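The error bars and asterisks in this legend follow common conventions that can be sketched in a few lines. The example below computes a mean with its standard error (sample standard deviation divided by the square root of n) and maps a P value to the asterisk thresholds listed above. It does not perform the t-test itself; the function names and the "ns" label for non-significant values are assumptions for illustration.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for s.e.m.")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def significance_stars(p_value):
    """Map a P value to the asterisk notation used in the legend."""
    thresholds = ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*"))
    for threshold, label in thresholds:
        if p_value < threshold:
            return label
    return "ns"  # assumption: label for P >= 0.05, not stated in the legend
```

For instance, `significance_stars(0.0005)` falls below 0.001 but not below 0.0001, so it maps to three asterisks.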

Extended Data Fig. 9 The Aw112010 protein is required for IL-12p40 production.

a–c, Predicted protein structure of the Aw112010 non-canonical ORF-encoded protein generated using the Quark package. Aw112010 is predicted to contain a single transmembrane domain. d, BMDMs were subjected to electroporation with indicated plasmids. BMDMs were stimulated with LPS (10 ng ml−1) for 6 h. Supernatants were analysed for IL-12p40 production by ELISA. Data are representative of three biological replicates. ***P < 0.001, unpaired two-tailed t-test.

Supplementary information

Supplementary Information

This file contains Supplementary Figure 1: Uncropped Western Blots.

Reporting Summary

Supplementary Table

This file contains Supplementary Table 1.

Cite this article

Jackson, R., Kroehling, L., Khitun, A. et al. The translation of non-canonical open reading frames controls mucosal immunity. Nature 564, 434–438 (2018).
