mRNA recognition and packaging by the human transcription–export complex

Abstract

Newly made mRNAs are processed and packaged into mature ribonucleoprotein complexes (mRNPs) and are recognized by the essential transcription–export complex (TREX) for nuclear export1,2. However, the mechanisms of mRNP recognition and three-dimensional mRNP organization are poorly understood3. Here we report cryo-electron microscopy and tomography structures of reconstituted and endogenous human mRNPs bound to the 2-MDa TREX complex. We show that mRNPs are recognized through multivalent interactions between the TREX subunit ALYREF and mRNP-bound exon junction complexes. Exon junction complexes can multimerize through ALYREF, which suggests a mechanism for mRNP organization. Endogenous mRNPs form compact globules that are coated by multiple TREX complexes. These results reveal how TREX may simultaneously recognize, compact and protect mRNAs to promote their packaging for nuclear export. The organization of mRNP globules provides a framework to understand how mRNP architecture facilitates mRNA biogenesis and export.

Fig. 1: Structure of an ALYREF–EJC oligomer.
Fig. 2: Structure of an endogenous TREX–mRNP complex.
Fig. 3: TREX–mRNP model and protein crosslinking.
Fig. 4: Architectures of TREX–mRNP complexes.
Fig. 5: Model for mRNA packaging.

Data availability

The 3D cryo-EM density maps of the ALYREF(55–182)–EJC–RNA complex, the TREX–EJC–RNA complex, the TREX–mRNA complex (composite map and maps A, B and C) and the THO–UAP56 complex (maps D and E) have been deposited into the Electron Microscopy Data Bank under the accession numbers EMD-14803, EMD-16633, EMD-14804, EMD-14805, EMD-14806, EMD-14807, EMD-14808 and EMD-14809, respectively. The coordinate files of the ALYREF–EJC–RNA, TREX–mRNA and THO–UAP56 complexes have been deposited into the PDB under the accession numbers 7ZNJ, 7ZNK and 7ZNL, respectively. The TREX–mRNA subtomogram average maps were deposited under the accession number EMD-16753. The raw cryo-tomography data and reconstructed tomograms were deposited into the Electron Microscopy Public Image Archive (EMPIAR) under the accession number EMPIAR-11465. Protein crosslinking data have been deposited into jPOST and ProteomeXchange with the accession codes JPST001488 and PXD031755, respectively.
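Because the accession numbers are listed "respectively" across nested groups of maps, the pairing can be spelled out explicitly. The following is a minimal sketch that assumes the accessions follow the order in which the maps are named in the statement (composite map, then maps A to C; map D, then map E). The dictionary names and keys are illustrative and are not part of the deposition records.

```python
# Pairing of deposited maps and models with accession codes.
# Assumption: accessions follow the order in which the maps are named
# in the data availability statement. Keys use ASCII hyphens for convenience.
EMDB_MAPS = {
    "ALYREF(55-182)-EJC-RNA": "EMD-14803",
    "TREX-EJC-RNA": "EMD-16633",
    "TREX-mRNA (composite map)": "EMD-14804",
    "TREX-mRNA (map A)": "EMD-14805",
    "TREX-mRNA (map B)": "EMD-14806",
    "TREX-mRNA (map C)": "EMD-14807",
    "THO-UAP56 (map D)": "EMD-14808",
    "THO-UAP56 (map E)": "EMD-14809",
}

# Atomic coordinate files deposited in the PDB.
PDB_MODELS = {
    "ALYREF-EJC-RNA": "7ZNJ",
    "TREX-mRNA": "7ZNK",
    "THO-UAP56": "7ZNL",
}

# Remaining depositions named in the statement.
OTHER_DEPOSITIONS = {
    "TREX-mRNA subtomogram averages": "EMD-16753",
    "Raw cryo-tomography data and tomograms": "EMPIAR-11465",
    "Crosslinking data (jPOST)": "JPST001488",
    "Crosslinking data (ProteomeXchange)": "PXD031755",
}

# Print a simple two-column listing of every deposition.
for table in (EMDB_MAPS, PDB_MODELS, OTHER_DEPOSITIONS):
    for name, accession in table.items():
        print(f"{accession}\t{name}")
```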

References

  1. Köhler, A. & Hurt, E. Exporting RNA from the nucleus to the cytoplasm. Nat. Rev. Mol. Cell Biol. 8, 761–773 (2007).

  2. Singh, G., Pratt, G., Yeo, G. W. & Moore, M. J. The clothes make the mRNA: past and present trends in mRNP fashion. Annu. Rev. Biochem. 84, 325–354 (2015).

  3. Khong, A. & Parker, R. The landscape of eukaryotic mRNPs. RNA 26, 229–239 (2020).

  4. Heath, C. G., Viphakone, N. & Wilson, S. A. The role of TREX in gene expression and disease. Biochem. J. 473, 2911–2935 (2016).

  5. Xie, Y. et al. Cryo-EM structure of the yeast TREX complex and coordination with the SR-like protein Gbp2. eLife 10, e65699 (2021).

  6. Cheng, H. et al. Human mRNA export machinery recruited to the 5′ end of mRNA. Cell 127, 1389–1400 (2006).

  7. Gromadzka, A. M., Steckelberg, A.-L., Singh, K. K., Hofmann, K. & Gehring, N. H. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs. Nucleic Acids Res. 44, 2348–2361 (2016).

  8. Shi, M. et al. ALYREF mainly binds to the 5′ and the 3′ regions of the mRNA in vivo. Nucleic Acids Res. 45, 9640–9653 (2017).

  9. Gatfield, D. et al. The DExH/D box protein HEL/UAP56 is essential for mRNA nuclear export in Drosophila. Curr. Biol. 11, 1716–1721 (2001).

  10. Sträßer, K. et al. TREX is a conserved complex coupling transcription with messenger RNA export. Nature 417, 304–308 (2002).

  11. Zenklusen, D., Vinciguerra, P., Wyss, J.-C. & Stutz, F. Stable mRNP formation and export require cotranscriptional recruitment of the mRNA export factors Yra1p and Sub2p by Hpr1p. Mol. Cell. Biol. 22, 8241–8253 (2002).

  12. Viphakone, N. et al. TREX exposes the RNA-binding domain of Nxf1 to enable mRNA export. Nat. Commun. 3, 1006 (2012).

  13. Viphakone, N. et al. Co-transcriptional loading of RNA export factors shapes the human transcriptome. Mol. Cell 75, 310–323.e8 (2019).

  14. Huertas, P. & Aguilera, A. Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination. Mol. Cell 12, 711–721 (2003).

  15. Adivarahan, S. et al. Spatial organization of single mRNPs at different stages of the gene expression pathway. Mol. Cell 72, 727–738.e5 (2018).

  16. Ashkenazy-Titelman, A., Atrash, M. K., Boocholez, A., Kinor, N. & Shav-Tal, Y. RNA export through the nuclear pore complex is directional. Nat. Commun. 13, 5881 (2022).

  17. Davidson, I. F. & Peters, J.-M. Genome folding through loop extrusion by SMC complexes. Nat. Rev. Mol. Cell Biol. 22, 445–464 (2021).

  18. Metkar, M. et al. Higher-order organization principles of pre-translational mRNPs. Mol. Cell 72, 715–726.e3 (2018).

  19. Piovesan, A. et al. Human protein-coding genes and gene feature statistics in 2019. BMC Res. Notes 12, 315 (2019).

  20. Singh, G. et al. The cellular EJC interactome reveals higher-order mRNP structure and an EJC–SR protein nexus. Cell 151, 750–764 (2012).

  21. Zhou, Z. et al. The protein Aly links pre-messenger-RNA splicing to nuclear export in metazoans. Nature 407, 401–405 (2000).

  22. Le Hir, H., Izaurralde, E., Maquat, L. E. & Moore, M. J. The spliceosome deposits multiple proteins 20–24 nucleotides upstream of mRNA exon–exon junctions. EMBO J. 19, 6860–6869 (2000).

  23. Bono, F., Ebert, J., Lorentzen, E. & Conti, E. The crystal structure of the exon junction complex reveals how it maintains a stable grip on mRNA. Cell 126, 713–725 (2006).

  24. Andersen, C. B. F. et al. Structure of the exon junction core complex with a trapped DEAD-box ATPase bound to RNA. Science 313, 1968–1972 (2006).

  25. Rodrigues, J. P. et al. REF proteins mediate the export of spliced and unspliced mRNAs from the nucleus. Proc. Natl Acad. Sci. USA 98, 1030–1035 (2001).

  26. Portman, D. S., O’Connor, J. P. & Dreyfuss, G. YRA1, an essential Saccharomyces cerevisiae gene, encodes a novel nuclear protein with RNA annealing activity. RNA 3, 527–537 (1997).

  27. Pühringer, T. et al. Structure of the human core transcription–export complex reveals a hub for multivalent interactions. eLife 9, e61503 (2020).

  28. Hautbergue, G. M. et al. UIF, a new mRNA export adaptor that works together with REF/ALY, requires FACT for recruitment to mRNA. Curr. Biol. 19, 1918–1924 (2009).

  29. Dufu, K. et al. ATP is required for interactions between UAP56 and two conserved mRNA export proteins, Aly and CIP29, to assemble the TREX complex. Genes Dev. 24, 2043–2053 (2010).

  30. Schuller, S. K. et al. Structural insights into the nucleic acid remodeling mechanisms of the yeast THO–Sub2 complex. eLife 9, e61467 (2020).

  31. Luo, M.-J. et al. Pre-mRNA splicing and mRNA export linked by direct interactions between UAP56 and Aly. Nature 413, 644–647 (2001).

  32. Fica, S. M., Oubridge, C., Wilkinson, M. E., Newman, A. J. & Nagai, K. A human postcatalytic spliceosome structure reveals essential roles of metazoan factors for exon ligation. Science 363, 710–714 (2019).

  33. Tunnicliffe, R. B., Tian, X., Storer, J., Sandri-Goldin, R. M. & Golovanov, A. P. Overlapping motifs on the herpes viral proteins ICP27 and ORF57 mediate interactions with the mRNA export adaptors ALYREF and UIF. Sci. Rep. 8, 15005 (2018).

  34. Viphakone, N. et al. Luzp4 defines a new mRNA export pathway in cancer cells. Nucleic Acids Res. 43, 2353–2366 (2015).

  35. Chang, C.-T. et al. Chtop is a component of the dynamic TREX mRNA export complex. EMBO J. 32, 473–486 (2013).

  36. Merz, C., Urlaub, H., Will, C. L. & Lührmann, R. Protein composition of human mRNPs spliced in vitro and differential requirements for mRNP protein recruitment. RNA 13, 116–128 (2007).

  37. Huang, Y., Yario, T. A. & Steitz, J. A. A molecular link between SR protein dephosphorylation and mRNA export. Proc. Natl Acad. Sci. USA 101, 9666–9670 (2004).

  38. Batisse, J., Batisse, C., Budd, A., Böttcher, B. & Hurt, E. Purification of nuclear poly(A)-binding protein Nab2 reveals association with the yeast transcriptome and a messenger ribonucleoprotein core structure. J. Biol. Chem. 284, 34911–34917 (2009).

  39. Skoglund, U., Andersson, K., Strandberg, B. & Daneholt, B. Three-dimensional structure of a specific pre-messenger RNP particle established by electron microscope tomography. Nature 319, 560–564 (1986).

  40. Ren, Y., Schmiege, P. & Blobel, G. Structural and biochemical analyses of the DEAD-box ATPase Sub2 in association with THO or Yra1. eLife 6, e20070 (2017).

  41. Montpetit, B. et al. A conserved mechanism of DEAD-box ATPase activation by nucleoporins and InsP6 in mRNA export. Nature 472, 238–242 (2011).

  42. Sträßer, K. & Hurt, E. Splicing factor Sub2p is required for nuclear mRNA export through its interaction with Yra1p. Nature 413, 648–652 (2001).

  43. Zenklusen, D., Vinciguerra, P., Strahm, Y. & Stutz, F. The yeast hnRNP-like proteins Yra1p and Yra2p participate in mRNA export through interaction with Mex67p. Mol. Cell. Biol. 21, 4219–4232 (2001).

  44. Murachelli, A. G., Ebert, J., Basquin, C., Le Hir, H. & Conti, E. The structure of the ASAP core complex reveals the existence of a Pinin-containing PSAP complex. Nat. Struct. Mol. Biol. 19, 378–386 (2012).

  45. Mazza, C., Ohno, M., Segref, A., Mattaj, I. W. & Cusack, S. Crystal structure of the human nuclear cap binding complex. Mol. Cell 8, 383–396 (2001).

  46. Baejen, C. et al. Transcriptome maps of mRNP biogenesis factors define pre-mRNA recognition. Mol. Cell 55, 745–757 (2014).

  47. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).

  48. Taniguchi, I. & Ohno, M. ATP-dependent recruitment of export factor Aly/REF onto intronless mRNAs by RNA helicase UAP56. Mol. Cell. Biol. 28, 601–608 (2008).

  49. Yan, Q. et al. Proximity labeling identifies a repertoire of site-specific R-loop modulators. Nat. Commun. 13, 53 (2022).

  50. Fan, J. et al. Exosome cofactor hMTR4 competes with export adaptor ALYREF to ensure balanced nuclear RNA pools for degradation and export. EMBO J. 36, 2870–2886 (2017).

  51. Schmidt, U. et al. Assembly and mobility of exon–exon junction complexes in living cells. RNA 15, 862–876 (2009).

  52. Derrer, C. P. et al. The RNA export factor Mex67 functions as a mobile nucleoporin. J. Cell Biol. 218, 3967–3976 (2019).

  53. Ben-Yishay, R. et al. Imaging within single NPCs reveals NXF1’s role in mRNA export on the cytoplasmic side of the pore. J. Cell Biol. 218, 2962–2981 (2019).

  54. Riback, J. A. et al. Viscoelastic RNA entanglement and advective flow underlie nucleolar form and function. Preprint at bioRxiv https://doi.org/10.1101/2021.12.31.474660 (2022).

  55. Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).

  56. Koch, B. et al. Generation and validation of homozygous fluorescent knock-in cells using CRISPR–Cas9 genome editing. Nat. Protoc. 13, 1465–1487 (2018).

  57. Muhar, M. et al. SLAM-seq defines direct gene-regulatory functions of the BRD4–MYC axis. Science 360, 800–805 (2018).

  58. Suzuki, K., Bose, P., Leong-Quong, R. Y., Fujita, D. J. & Riabowol, K. REAP: a two minute cell fractionation method. BMC Res. Notes 3, 294 (2010).

  59. Kastner, B. et al. GraFix: sample preparation for single-particle electron cryomicroscopy. Nat. Methods 5, 53–55 (2008).

  60. Mastronarde, D. N. Automated electron microscope tomography using robust prediction of specimen movements. J. Struct. Biol. 152, 36–51 (2005).

  61. Tegunov, D. & Cramer, P. Real-time cryo-electron microscopy data preprocessing with Warp. Nat. Methods 16, 1146–1152 (2019).

  62. Scheres, S. H. W. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).

  63. Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).

  64. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486–501 (2010).

  65. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D 60, 2126–2132 (2004).

  66. Croll, T. I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr. D 74, 519–530 (2018).

  67. Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D 66, 213–221 (2010).

  68. Afonine, P. V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr. D 74, 531–544 (2018).

  69. Goddard, T. D. et al. UCSF ChimeraX: meeting modern challenges in visualization and analysis. Protein Sci. 27, 14–25 (2018).

  70. Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).

  71. Zivanov, J., Nakane, T. & Scheres, S. H. W. Estimation of high-order aberrations and anisotropic magnification from cryo-EM data sets in RELION-3.1. IUCrJ 7, 253–267 (2020).

  72. Webb, B. & Sali, A. in Functional Genomics Vol. 1654 (eds Kaufmann, M., Klinger, C. & Savelsbergh, A.) 39–54 (Springer, 2017).

  73. Hagen, W. J. H., Wan, W. & Briggs, J. A. G. Implementation of a cryo-electron tomography tilt-scheme optimized for high resolution subtomogram averaging. J. Struct. Biol. 197, 191–198 (2017).

  74. Mastronarde, D. N. & Held, S. R. Automated tilt series alignment and tomographic reconstruction in IMOD. J. Struct. Biol. 197, 102–113 (2017).

  75. Nicastro, D. et al. The molecular architecture of axonemes revealed by cryoelectron tomography. Science 313, 944–948 (2006).

  76. Heumann, J. M., Hoenger, A. & Mastronarde, D. N. Clustering and variance maps for cryo-electron tomography using wedge-masked differences. J. Struct. Biol. 175, 288–299 (2011).

  77. Krijthe, J. H. Rtsne: t-distributed stochastic neighbor embedding using Barnes-Hut implementation. R version 4.2.3 https://cran.r-project.org/web/packages/Rtsne/Rtsne.pdf (2015).

  78. The R Development Core Team. R: A Language and Environment for Statistical Computing version 4.2.3 (R Foundation for Statistical Computing, 2021).

  79. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).

  80. Mendes, M. L. et al. An integrated workflow for crosslinking mass spectrometry. Mol. Syst. Biol. 15, e8994 (2019).

  81. Lenz, S. et al. Reliable identification of protein–protein interactions by crosslinking mass spectrometry. Nat. Commun. 12, 3564 (2021).

  82. Piovesan, A., Caracausi, M., Antonaros, F., Pelleri, M. C. & Vitale, L. GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics. Database 2016, baw153 (2016).

  83. Abràmoff, M. D., Magalhães, P. J. & Ram, S. J. Image processing with ImageJ. Biophot. Int. 11, 36–42 (2004).

  84. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  85. Varadi, M. et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).

  86. Tegunov, D. High-resolution in situ imaging of biological samples with Warp and M. Microsc. Microanal. https://doi.org/10.1017/S1431927620023478 (2020).

Acknowledgements

We thank members of the Plaschka Group for their help and discussions; staff at the Protein Technologies Facility at the Vienna BioCenter Core Facilities (VBCF), a member of the Vienna BioCenter (VBC), for assistance with protein production; staff at the VBCF Electron Microscopy Facility, in particular T. Heuser and H. Kotisch, for support, data collection and maintaining facilities; V.-V. Hodirnau at the Institute of Science and Technology Austria EM facility for cryo-EM data collection; K. M. Davies and V. Vogirala for data collection at eBIC (Diamond Light Source) and I. Grishkovskaya for assistance; R. Zimmermann and his team for computational support; K. Mechtler and his team for MS; staff at the in-house Molecular Biology Service for reagents; staff at the VBCF Next Generation Sequencing Facility for Illumina sequencing; J. Ahel for help with scripting and M. Novatchkova for help with the tSNE analysis; members of the Haselbach and Balzarotti laboratories (IMP Vienna) for sharing reagents; S. Falk and J. Zuber for sharing reagents and expertise; A. Phillip for help with mammalian cell culture; the UCSF ChimeraX team, especially T. Goddard, for implementing new functions for tomography analysis; S. Ameres, C. Bernecky, J. Brennecke, L. Cochella, A. Pauli and A. Stark for discussions; and S. Ameres, C. Bernecky, J. Brennecke, D. Gerlich, E. Nogales, A. Pauli, J.-M. Peters, G. Riddihough (Life Science Editors) and A. Stark for critical reading of the manuscript. M.K.V. was supported by an EMBO Postdoctoral Fellowship. D.R.-B. was supported by a Marie Sklodowska-Curie fellowship (101028744). J.R. was supported by core funding from the Wellcome Trust (203149). C.P. was supported by Boehringer Ingelheim and the European Research Council (ERC-2020-STG 949081 RNApaxport). For the purpose of open access, the author has applied for a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Author information

Contributions

B.P.-F., M.K.V., L.F. and F.I.A. purified recombinant proteins and endogenous complexes. B.P.-F. and M.K.V. performed biochemical experiments. B.P.-F. and M.K.V. collected cryo-EM SPA data. B.P.-F., M.K.V. and C.P. analysed cryo-EM single-particle data and modelled the ALYREF–EJC–RNA structure. C.P. modelled the THO–UAP56 and TREX–mRNA structures. M.K.V. collected and analysed cryo-electron tomography data. D.R.-B., B.P.-F. and M.K.V. grew mammalian cells and made NEs. U.S. carried out RNA sequencing. M.K.V. crosslinked complexes. F.J.O. and J.R. produced and analysed the crosslinking MS data. B.P.-F., M.K.V. and C.P. analysed the data and prepared the manuscript with input from all authors. C.P. initiated and supervised the project.

Corresponding author

Correspondence to Clemens Plaschka.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Giulia Zanetti and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Biochemical characterization of TREX–EJC–RNA and ALYREF–EJC–RNA complexes.

a. Domain architecture of ALYREF constructs and their nomenclature used throughout. N- and C-UBM, N- and C-terminal UAP56-binding motif; RBD1 and RBD2, RNA-binding domain 1 and 2; RRM, RNA-recognition motif; MBP, maltose-binding protein; 3C, PreScission protease cleavage site; His, histidine tag. b. ALYREFN reconstitutes the EJC in vitro. Pulldown assay with MBP-ALYREFN (bait) incubated with EIF4A3, MAGOH–Y14 (residues 66–154), or both, with or without a 15-nucleotide (nt) single-stranded RNA (ssRNA) and/or AMP-PNP. Complex formation was determined by SDS-PAGE analysis with Coomassie blue staining. This exact experiment was done once, but similar results were obtained in two additional experiments either without AMP-PNP or without RNA. c. ALYREFN–EJC–RNA complexes form multimers. ALYREFN–EJC–RNA was assembled on 50-nt (top) or 15-nt (middle) ssRNAs and analyzed in sucrose density gradients. SDS-PAGE analysis with Coomassie blue staining of gradient fractions indicates multiple oligomeric states. The sucrose gradient sedimentation profile (bottom) is based on quantification of MAGOH band intensities. The sedimentation coefficients were estimated in CowSuite based on the predicted molecular weights of the different oligomeric states (an ALYREFN–EJC–RNA monomer is ~150 kDa). The sedimentation range of one to six ALYREFN–EJC–RNA complexes is indicated. We analyzed even-numbered fractions by SDS-PAGE and additionally included fractions 7 and 15 to better resolve the monomer and hexamer peaks. Gradient conditions are specified on top. This exact experiment was done once, but ALYREFN–EJC–RNA multimerization was similarly observed in an experiment with different gradient ultracentrifugation parameters. d. The ALYREF WxHD domain is sufficient for EJC reconstitution. Pulldown assay with different MBP-ALYREF truncation constructs (see panel a) or MBP-CASC3SELOR as bait and EIF4A3 and MAGOH–Y14 (residues 66–154) to probe EJC-reconstitution efficiency. Complex formation was determined by SDS-PAGE analysis with Coomassie blue staining. This experiment was done twice. For gel source data, see Supplementary Fig. 3. e. ALYREF(55–182)–EJC–RNA oligomers form in vitro, are resistant to RNase treatment, and do not require the ALYREF RBD1 and UBM domains. The ALYREF(55–182)–EJC–RNA complex was assembled on 15-nt ssRNA and treated (bottom) or not treated (top) with 20 µg mL−1 benzonase to digest protein-unbound RNA. The complexes were then analyzed in sucrose density gradients. SDS-PAGE analysis with Coomassie blue staining of gradient fractions indicates indistinguishable oligomeric sedimentation profiles of the ALYREF(55–182)–EJC–RNA complex, with or without benzonase digestion. The sucrose gradient sedimentation profile (bottom) is based on quantification of MAGOH band intensities. The hexamer peak (confirmed by negative staining, see panel f) is indicated with a grey box. Gradient conditions are specified. This experiment was done twice, the second time with 2 µg mL−1 benzonase. f. Negative stain 2D class averages show that MBP-ALYREF(55–182)–EJC–RNA (15 nt) complexes form trimeric (top) and hexameric (bottom) complexes. Cartoon interpretations are shown on the right. Scale bar, 250 Å. g. Ribbon model showing the location of mutated residues in ALYREF in the ALYREF–EJC interface. Mutated residues are shown as sticks and Cα-spheres colored by the ALYREF–EJC interface. h. ALYREF–EJC interface mutations in the ALYREF(55–182) constructs reduce the efficiency of EJC reconstitution. The pulldown assay was carried out as in panel d. Mutated residues are indicated in panel g. Complex formation was determined by SDS-PAGE analysis with Coomassie blue staining. This experiment was done twice. i. Mutation of ALYREF in the ALYREF–EJC interfaces impairs ALYREF–EJC–RNA complex oligomerization in vitro. ALYREFM-b and ALYREFM-c+∆d mutants were made in the ALYREF(55–182) construct (see panel g for mutant details). Wild-type or mutant ALYREF(55–182) or the isolated CASC3SELOR were used to assemble EJC–RNA complexes on a 15-nt RNA and analyzed in sucrose density gradients for their multimerization. SDS-PAGE analysis with Coomassie blue staining of gradient fractions indicates loss of high-order oligomers in the sedimentation profiles of ALYREF mutants, which resemble the pattern of the monomeric CASC3SELOR. The sucrose gradient sedimentation profile (bottom) is based on quantification of MAGOH band intensities. The hexamer peak is indicated with a grey rectangle. Gradient conditions are specified on top. This exact experiment was done once. j. The in vivo mutation of ALYREF in the ALYREF–EJC interface (mutant ALYREFM-c+∆d; see panel g for details) impairs its interaction with mRNP components. Wild-type FLAG-tagged ALYREFWT or the FLAG-ALYREFM-c+∆d mutant were ectopically overexpressed in K562 cells, which also ectopically overexpressed THOC1-GFP. The two cell lines were used to prepare nuclear extracts (NEs), which were then treated with benzonase for 16 h at 4 °C in the presence of 5 mM MgCl2 (final concentration). The benzonase-treated extracts were then applied to anti-FLAG M2 resin for purification. Western blot analysis shows wild-type ALYREF or the mutant ALYREFM-c+∆d (via their FLAG-tag), NCBP1, and EIF4A3. This experiment was done twice. k. The THO–UAP56 complex does not form a complex with ALYREFN in sucrose density gradients, suggesting that UAP56 binds the ALYREF UBM with low affinity, as observed in yeast30. SDS-PAGE stained with Coomassie blue. The sucrose gradient sedimentation profile (bottom) is based on quantification of THOC2 and ALYREF band intensities. Gradient conditions are specified on top. This experiment was done twice. For gel source data, see Supplementary Fig. 4. l. In vitro reconstitution of TREX–EJC–RNA. The recombinant proteins or sub-complexes were mixed as shown in Fig. 1d and applied to sucrose density gradient ultracentrifugation. SDS-PAGE analysis with Coomassie blue staining confirms the formation of a complex containing all eleven protein subunits, with a sedimentation coefficient of ~75 S. Gradient conditions are specified on top. This experiment was done four times. m. The ALYREF-UBM–UAP56 interaction is required to form the TREX–EJC–RNA complex in vitro. Sucrose gradient sedimentation profiles of (from top to bottom): ALYREFN–EJC–RNA, ALYREF(55–182)–EJC–RNA, THO–UAP56, THO–UAP56 with ALYREFN–EJC–RNA, and THO–UAP56 with ALYREF(55–182)–EJC–RNA. Gradient fractions were analyzed by SDS-PAGE and Coomassie staining. Below, sucrose gradient sedimentation profiles are based on quantifications of the EJC subunit MAGOH and THO complex subunit THOC2 band intensities. MAGOH intensities were multiplied by a factor of 3 for better visualization. Gradient fractions containing ALYREF–EJC–RNA (light grey), THO–UAP56 (light grey), or TREX–EJC–RNA (grey) are shown with rectangles. Gradient conditions are specified on top. This experiment was done five times. n. A monomeric THO complex (THOMonomer) does not form TREX–EJC–RNA complexes in vitro. Sucrose gradient sedimentation profiles of THOMonomer–UAP56 (see Methods for details) alone or in the presence of ALYREFN–EJC–RNA, assembled on a 15-nt ssRNA. Gradient fractions were analyzed by SDS-PAGE and Coomassie staining. THOMonomer–UAP56 did not form TREX–EJC–RNA complexes (compare to panel m). Below, sucrose gradient sedimentation profiles are based on quantifications of the EJC subunit MAGOH and THO complex subunit THOC2 band intensities. MAGOH intensities were multiplied by a factor of 3 for better visualization. Gradient conditions are specified on top. This experiment was done twice.
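The mass bookkeeping behind panels c, e and i can be illustrated with a short calculation. The sketch below takes the ~150 kDa monomer mass quoted in the legend and lists the predicted masses of one to six copies; it also shows one possible way to turn band intensities into a sedimentation profile, namely as the fraction of total signal per gradient fraction. That normalization, the function names and the example intensities are assumptions for illustration, and the sketch does not reproduce the sedimentation coefficient estimates from CowSuite.

```python
# Approximate mass of one ALYREF-EJC-RNA unit, as quoted in the legend.
MONOMER_KDA = 150.0


def oligomer_masses(monomer_kda=MONOMER_KDA, max_copies=6):
    """Predicted mass in kDa for 1..max_copies copies of the complex."""
    return {copies: copies * monomer_kda for copies in range(1, max_copies + 1)}


def fraction_of_total(band_intensities):
    """Express each gradient fraction as a share of the summed band intensity.

    Assumption: the legend does not state how profiles were normalized.
    """
    total = sum(band_intensities)
    if total <= 0:
        raise ValueError("at least one band intensity must be positive")
    return [value / total for value in band_intensities]


# Predicted masses for monomer through hexamer.
for copies, mass in oligomer_masses().items():
    print(f"{copies} x complex: ~{mass:.0f} kDa")

# Hypothetical MAGOH band intensities for three gradient fractions.
print(fraction_of_total([1.0, 3.0, 4.0]))
```

Under these assumptions a hexamer corresponds to roughly 900 kDa, which is the upper end of the sedimentation range marked in panel c.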

Extended Data Fig. 2 ALYREF–EJC–RNA and TREX–EJC–RNA complex cryo-EM image processing and structural details.

a. Denoised cryo-EM micrographs of ALYREF55–182–EJC–RNA (left) and TREX–EJC–RNA (right) complexes (see Methods). Scale bar, 500 Å. The ALYREF55–182–EJC–RNA dataset contained 7,891 micrographs and the TREX–EJC–RNA dataset 12,938 micrographs, respectively. b. TREX–EJC–RNA complexes contain multiple THO–UAP56 complexes, caging in a central ALYREFN–EJC–RNA complex. Single TREX–EJC–RNA particles from a denoised cryo-EM micrograph can contain two (left) or three (right) THO–UAP56 complexes. In 2D class averages, the THO–UAP56 complexes blur out, because the central ALYREFN–EJC–RNA complex is aligned (bottom). c. Three-dimensional image classification tree of ALYREF55–182–EJC–RNA (left) and TREX–EJC–RNA (right) cryo-EM data. The ALYREF55–182–EJC–RNA dataset contained 7,891 micrographs from which 2,139,936 particles were picked and extracted. Three initial volumes were generated from 100,000 particles in cryoSPARC63 using the ab-initio reconstruction algorithm, which served as reference volumes to classify the entire dataset using three rounds of heterogenous classification (see Methods). The final particle stack contained 1,564,602 particles and was refined to 2.4 Å using D3 symmetry. The TREX–EJC–RNA dataset contained 1,050,740 particles, which were classified using initial volumes obtained from the ALYREF55–182–EJC–RNA dataset and from ab initio reconstructions. After 3D classification, 3D refinement and application of D3 symmetry in cryoSPARC63 yielded a 3.0 Å resolution map from 383,520 particles. The type of mask is indicated for each 3D refinement. Please refer to Methods for further details. d. ALYREF55–182–EJC–RNA and TREX–EJC–RNA give rise to indistinguishable 2D classes and reconstructions (top left and right), apart from the higher resolution of the ALYREF55–182–EJC–RNA dataset (bottom left), which contains more particles. e. Representative protein (MAGOH, top) and RNA (bottom) densities from the 2.4 Å resolution ALYREF55–182–EJC–RNA map. f. 
Orientation distribution plots for all particles contributing to the ALYREF55–182–EJC–RNA and TREX–EJC–RNA cryo-EM maps, visualized in cryoSPARC63. g. Gold-standard Fourier shell correlation (FSC = 0.143) of the ALYREF55–182–EJC–RNA and TREX–EJC–RNA cryo-EM maps. h. Cryo-EM densities for ALYREF–EJC interface residues from the 2.4 Å ALYREF55–182–EJC–RNA map. i. Multiple sequence alignment showing the conservation of ALYREF–EJC interface residues in ALYREF, EIF4A3, and MAGOH using human (H.s.), Danio rerio (D.r.), Drosophila melanogaster (D.m.), Caenorhabditis elegans (C.e.), Arabidopsis thaliana (A.t.), and Schizosaccharomyces pombe (S.p.) sequences. A different type of arrow is used to indicate residues of interfaces b, c, and d.
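
The resolution values quoted for these maps correspond to the point where the gold-standard FSC curve falls below 0.143. As a minimal sketch of that readout, the Python function below takes a hypothetical curve given as spatial frequencies (in 1/Å) and FSC values and interpolates linearly between shells. The function name, the interpolation scheme and the example values are assumptions made for illustration; they are not taken from the study or from cryoSPARC.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (in Å) where the FSC first drops below `threshold`.

    freqs: spatial frequencies in 1/Å, in ascending order
    fsc:   FSC value for each frequency shell
    Linear interpolation between neighbouring shells is an assumption of this sketch.
    """
    for i in range(1, len(freqs)):
        above, below = fsc[i - 1], fsc[i]
        if above >= threshold > below:
            # Fraction of the way from shell i-1 to shell i where the curve crosses
            frac = (above - threshold) / (above - below)
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None  # the curve never crosses the threshold


# Hypothetical curve, for illustration only
example_freqs = [0.1, 0.2, 0.3, 0.4, 0.5]
example_fsc = [1.0, 0.8, 0.243, 0.043, 0.0]
print(resolution_at_threshold(example_freqs, example_fsc))
```

The function returns `None` when the curve stays above the threshold, which in practice would mean the resolution estimate is limited by the sampling rather than by the data.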

Extended Data Fig. 3 Comparisons of ALYREF–EJC interaction details with a viral ALYREF–ORF57 complex, the cytoplasmic CASC3–EJC–RNA complex, and the EJC-bound P-complex spliceosome.

a. Organization of ALYREF. Top: Structural model of full-length ALYREF predicted with AlphaFold84,85. Annotated domains (N-UBM, WxHD motif, RRM and C-UBM) are colored in darker shades of purple. Spheres represent backbone atoms of glycine and arginine residues in the RBD domains. Middle: ALYREF domain diagram. Black bars indicate residues that are included as an atomic model in this study. Bottom: AlphaFold per residue confidence score (pLDDT) plot. High values are indicative of high confidence predictions, whereas low values represent residues that are likely disordered in solution. b. Comparison of the ALYREF RRM domain interaction with the EJC subunit MAGOH (interface c, left) and the Herpes simplex virus ORF57 (right)33. ALYREF binds viral ORF57 differently compared to the overlapping ALYREF–EJC interface c. This supports a general model that ALYREF can use multiple interfaces to engage either viral proteins, such as ORF57, or mRNP maturation marks, such as the CBC or EJCs, and may enable ALYREF to broadly select its RNA targets. c. Details of the WxHD motifs binding to the EJC. Left: Modelling of apo EIF4A3 bound to the WxHD motif indicates a clash with EIF4A3 residue Y202, suggesting that ALYREF can only bind to RNA-bound EJC (see Supplementary Video 2). Middle: the same view, showing the ALYREF WxHD motif bound to RNA-bound EJC (this study). Right: the same view, showing the CASC3 WxHD motif bound to RNA-bound EJC23, revealing conserved binding modes of ALYREF and CASC3. d. The ALYREF WxHD and RRM domains bind the same interfaces between EIF4A3 and MAGOH as the CASC3 SELOR domain. Top: Overview image of the ALYREF–EJC–RNA structure (left) and comparison of the binding modes of ALYREF and CASC3 (middle and right, respectively). Bottom: Sequence alignment of ALYREF (top) and CASC3 (bottom), showing the conserved WxHD motif and an additional short conserved motif (QEL[F/I]Ax[F/Y]G), which, however, does not contact the EJC in the ALYREF–EJC–RNA structure. 
Conserved (dark blue) and partially conserved (light blue) residues are indicated with boxes. Residues in ALYREF and CASC3 contacting the EJC are indicated. e. Superposition of the ALYREF–EJC–RNA complex (this study) onto the human P-complex spliceosome cryo-EM structure (PDB ID 6QDV)32, via their EJC EIF4A3 subunits. This model reveals that higher order ALYREF–EJC complexes such as the ALYREF–EJC dimer are not possible when the EJC is still bound by the spliceosome, as the P-complex subunit SNU114 clashes with the RRM in an ALYREF–EJC dimer. In addition, SNU114 likely disfavors binding of a single molecule of ALYREF to the EJC, as there is a steric clash with the N-terminal ordered ALYREF residue (Asp 85) in the ALYREF–EJC structure.

Extended Data Fig. 4 Endogenous TREX–mRNP complex purification strategies, biochemical characterization, and negative stain EM.

a. Endogenous TREX–mRNP complexes were obtained via affinity purification of ectopically overexpressed THOC1-GFP in K562 cell nuclear extract (NE), which underwent a mild nuclease treatment. Purified TREX–mRNPs sediment at ~90–100 S in a sucrose density gradient. Individual fractions were analyzed by SDS-PAGE and S-values were estimated using CowSuite. This experiment was done more than ten times. b. Mass spectrometry analysis of endogenous TREX–mRNP complexes shows the 11 members of TREX and EJC within the top 12 hits. The relative abundance of each protein was estimated by summing up the peak areas of the top three peptides. Asterisks indicate tubulin proteins, which are abundant cellular proteins that are common purification contaminants. See Supplementary Table 1 for a complete list of identified proteins. c. TREX–mRNP purification yields the same protein composition using different strategies: (i) two different cell lines, ectopic THOC1-GFP overexpression (Lenti O/E) versus endogenous GFP-THOC5 CRISPR/Cas9-tagging (Endo), (ii) nuclear extract preparation methods, rapid cell fractionation (RCF) versus the standard nuclear extract preparation protocol (see Methods for details) or (iii) without and with mild nuclease digestion with benzonase. SDS-PAGE gels after affinity purification using GFP-trap resin and elution with 3C protease are shown. The experiment comparing RCF versus standard nuclear extract preparation protocols was done once. The comparison between THOC1-3C-GFP Lenti O/E and GFP-3C-THOC5 Endo nuclear extracts was done twice. The comparison between benzonase and non-benzonase treatments was done eight times. For gel source data, see Supplementary Fig. 5. d. SRSF1 is phosphorylated in endogenous TREX–mRNP complexes. Western blot analysis of SRSF1 in purified TREX–mRNPs before (lane 1) and after (lane 2) treatment with lambda phosphatase. 
Phosphorylated SRSF1 migrates slower during SDS-PAGE and is less efficiently recognized by the anti-SRSF1 antibody. This experiment was done four times. For gel source data, see Supplementary Fig. 6. e. NXF1 is absent from purified TREX–mRNPs. Western blot showing protein levels of THOC1, NXF1, EIF4A3 and the proteasome subunit PSMA7 control in input (standard nuclear extract) and affinity purified TREX–mRNPs. While THOC1 and EIF4A3 are enriched in TREX–mRNPs, NXF1 and the proteasome are not. NXF1–NXT1 may be absent from TREX–mRNPs either due to a low affinity interaction with TREX–mRNPs or because it associates after an additional mRNP remodelling step. The experiment was done twice. For gel source data, see Supplementary Fig. 7. f. Mild nuclease treatment is required to obtain well-separated TREX–mRNP particles for electron microscopy. The nuclease activity of benzonase was reduced by omitting magnesium from the buffer. Negative stain EM micrographs of TREX–mRNPs purified from nuclear extract either without (left) or with (right) mild nuclease treatment show that non-treated TREX–mRNP particles more frequently clump together. This experiment was done once. Scale bar, 200 Å. g. Purified TREX–mRNPs without (top) or with (bottom) mild nuclease treatment show identical negative stain EM 2D class averages. TREX complexes are indicated on the 2D classes using green arrow heads, showing that in both conditions single and multiple TREX complexes are bound to a globular mRNP density. Scale bar, 200 Å. h. Purified TREX–mRNPs without (left) or with (right) mild nuclease treatment show identical negative stain EM 3D reconstructions. Scale bar, 200 Å. i. Nuclease treatment does not affect TREX–mRNP particle diameter or shape when visualized with negative stain EM. Left: Violin plot of TREX–mRNP particle diameters measured on negative stain electron micrographs. Horizontal bars indicate 25th (grey), 50th (black) and 75th (grey) percentiles. 
Nuclease-treated (n = 259) or untreated (n = 245) particles are not significantly different (Welch’s t-test, p = 0.91). Right: Particle roundness, calculated by dividing the length of the shortest axis of each particle by the length of the longest axis, is also not significantly different (Welch’s t-test, p = 0.82).
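
The roundness metric and the Welch's t-test used in panel i can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the inputs are placeholder numbers rather than measurements from the study, and only the t statistic and the Welch–Satterthwaite degrees of freedom are computed. Obtaining the p value additionally needs the t-distribution, for example via SciPy's `scipy.stats.ttest_ind` with `equal_var=False`.

```python
import math
from statistics import mean, variance


def roundness(shortest_axis, longest_axis):
    """Roundness as defined in the legend: shortest axis divided by longest axis."""
    return shortest_axis / longest_axis


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two samples may have unequal variances and sizes.
    """
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n)
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Placeholder values, for illustration only
print(roundness(150.0, 200.0))
print(welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```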

Extended Data Fig. 5 Endogenous TREX–mRNP complex cryo-EM image processing, reconstructions, and biochemistry of UAP56–ALYREF.

a. Three-dimensional image classification tree of endogenous TREX–mRNP complex cryo-EM data27,58. The complete data set contained 840,469 TREX–mRNP particles, which were classified in multiple rounds of 3D classification (with regularization parameter T = 4 for all RELION classifications) and focused refinement in RELION62,71. The best particles were used to extract symmetry related dimers, separately, yielding 415,848 dimer particles, which were further classified and refined in cryoSPARC63. This yielded maps A (cyan), B (light green), and C (slate blue) (see Methods for details). The percentage of TREX–mRNP particles (black) or TREX dimer units (orange) contributing to each class is provided. The type of mask and overall resolution are indicated for each 3D refinement. b. Gold-standard Fourier shell correlation (FSC = 0.143) of the TREX–mRNA cryo-EM maps A, B, and C. c. Orientation distribution plots for all particles contributing to the TREX–mRNA cryo-EM maps A, B, and C, visualized in cryoSPARC63. d. The composite TREX–mRNA cryo-EM density is shown from front and left side views (maps A, B, and C), and colored by local resolution as determined by cryoSPARC63. e. The composite TREX–mRNA cryo-EM density (maps A, B, and C) is shown opposite of the refined TREX–mRNA coordinate model, which is shown as ribbons and colored as in Fig. 2d. f. The TREX–mRNA complex subunits THOC1, THOC5 (tRWD domain), and THOC6 are shown superimposed on their respective cryo-EM densities. Below each protein a representative segment of the protein is superimposed on the respective cryo-EM density. g. The TREX monomer A is mobile in the TREX–mRNA complex. Two densities obtained from 3D variability analysis (class 3 in grey and class 8 in green) are overlaid, revealing that monomer A can shift globally by ~25 Å. This mobility can explain why monomer A, and the associated UAP56 molecule, have a low local resolution. h. 
The TREX–mRNA map reveals density for the UAP56 RecA1 lobe, the ALYREF UBM, and putatively assigned mRNA, which were fitted as a single rigid body of a yeast Yra1–Sub2–RNA homology model (5SUP). The ALYREF UBM, which could be either N- or C-terminal, is visible at lower density threshold, and was modelled as the C-UBM based on its position in the yeast Yra1 (C-UBM)–Sub2–RNA crystal structure and an AlphaFold2 multimer model84 of the ALYREF C-UBM bound to human UAP56. i. Mutation of human UAP56 residues at the ALYREF-UBM to UAP56 interface supports the ALYREF-UBM density assignment. Top: Interface mutations are mapped onto the UAP56 coordinate model and labelled. Bottom: In vitro, a fluorescently labeled ALYREF C-UBM peptide binds to wildtype UAP56 but not mutated UAP56. This experiment was done once. For gel source data, see Supplementary Fig. 8. j. Comparison of human ALYREF-UBM–UAP56–RNA (this study) and yeast Yra1-UBM–Sub2–RNA–ATP-analog (5SUP)40 structures. k. An RNA filter-binding assay suggests that the ALYREF RNA binding domains 1 and 2 (RBD1 and RBD2) might assist RNA delivery to UAP56, but not the isolated ALYREF55–182 construct that forms EJC contacts (see Fig. 1, Extended Data Fig. 1). Left: Boundaries of protein constructs used for RNA affinity measurements using filter binding assays. Middle: Binding curves of the tested constructs. The plot shows mean values from n = 6 measurements, error bars indicate the standard deviation of each measurement, and solid or dotted lines show the fit of a “Specific binding with Hill-slope”-function to the data, with the Bmax constrained to 1 as implemented in GraphPad Prism (see Methods). Right: Measured dissociation constants (KD) of the tested constructs as determined by the fits in the middle panel; spheres indicate the KD determined from the fit and error bars indicate the 95% confidence interval determined from the fit. 
RNA binding by isolated UAP56 is not detectable in the absence of ATPγS, but UAP56 binds RNA with a KD of ~900 nM (95% confidence interval: 810–1,014 nM) in the presence of 1 mM ATPγS. The ALYREF-RNA binding activity is contained in its RBD1 and RBD2 domains, but not in the WQHD motif or RRM domain. These experiments were done twice, with three technical replicates each.
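
The binding model named in panel k, specific binding with a Hill slope and Bmax fixed to 1, has the form Y = X^h / (KD^h + X^h). The Python sketch below writes out that equation and recovers KD and the Hill slope h from data with a coarse grid search. The grid search is a stand-in chosen to keep the example dependency-free; it is not the non-linear regression that GraphPad Prism performs, and the function names, concentrations and grids are hypothetical.

```python
def hill_binding(conc, kd, hill, bmax=1.0):
    """Fraction of RNA bound at protein concentration `conc` (same units as `kd`)."""
    if conc == 0:
        return 0.0
    return bmax * conc ** hill / (kd ** hill + conc ** hill)


def sum_squared_error(concs, fractions, kd, hill):
    """Sum of squared residuals between measured fractions and the model."""
    return sum((y - hill_binding(x, kd, hill)) ** 2 for x, y in zip(concs, fractions))


def grid_fit(concs, fractions, kd_grid, hill_grid):
    """Return the (kd, hill) pair on the grid with the smallest squared error."""
    candidates = ((kd, h) for kd in kd_grid for h in hill_grid)
    return min(candidates, key=lambda p: sum_squared_error(concs, fractions, *p))


# Synthetic, noise-free data generated from kd = 900 nM and hill = 1 (illustration only)
concs_nm = [50, 100, 300, 900, 2700, 8100]
fractions = [hill_binding(c, 900.0, 1.0) for c in concs_nm]
print(grid_fit(concs_nm, fractions, [300.0, 600.0, 900.0, 1200.0], [0.5, 1.0, 1.5]))
```

At a protein concentration equal to KD the model returns exactly half-maximal binding regardless of the Hill slope, which is why KD can be read off the midpoint of each curve.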

Extended Data Fig. 6 Recombinant THO–UAP56 complex cryo-EM image processing and reconstructions.

a. Three-dimensional image classification tree of the in vitro reconstituted THO–UAP56 cryo-EM data set27. The symmetry-expanded data set contained 314,583 high-quality particles. Classification and focused refinements in cryoSPARC63 yielded maps D (pink) and E (green) (see Methods for details). The percentage of THO–UAP56 dimer units contributing to each class is provided. The type of mask and overall resolution are indicated for each 3D refinement. b. Gold-standard Fourier shell correlation (FSC = 0.143) of the cryo-EM maps D and E. c. Orientation distribution plots for all particles contributing to cryo-EM maps D and E, visualized in cryoSPARC63. d. THO–UAP56 complex monomer A composite cryo-EM densities from front and left side views (maps D and E), colored by local resolution as determined by RELION 3.162,71. e. Representative regions of the newly determined THO–UAP56 cryo-EM densities (top) in comparison to previous data (bottom)27. The new densities are superimposed on the updated and refined THO–UAP56 coordinate model. Segments of THOC2 residues 316–330, residues 576–590, and THOC3 residues 163–170 are shown. f. A new model of the human THO–UAP56 complex. Newly modelled regions are shown in yellow, and contain segments of THOC1, THOC2, and THOC3. Regions with newly modelled sidechains are colored orange and are built on the previously available backbone models of THOC2 and THOC3. This updated model reveals new contacts among THOC1, −2, and −3 subunits. The newly built THOC1 C-terminus meanders along the length of the THOC2 subunit ‘bow’, ‘MIF4G’, and ‘stern’ domains (Fig. 2e). The THOC1 C-terminal residues (458–528) were initially modelled using AlphaFold (Methods)84,85. The THOC2 ‘anchor’ forms a 5-helix bundle that packs against THOC5 helix α2 and THOC7 helices α2 and α3, and the THOC3 β-propeller blades 3 and 4 make a stabilizing contact with THOC2 ‘bow’ loop α17-α18 (Fig. 2e). 
Unchanged regions are colored grey and green and contain modelled backbones or sidechains, respectively.

Extended Data Fig. 7 Crosslinking mass spectrometry of endogenous TREX–mRNPs.

a. Crosslinks mapped onto TREX monomers 1A and 1B. Monomer 1A and 1B are shown as transparent surfaces and crosslinks are colored according to the Cα-Cα distance of crosslinked residues. Symmetry related monomers 2A and 2B are shown in ribbon representation and colored as in Fig. 2d. Crosslinks that span more than 30 Å may be explained through proximity between TREX complexes on mRNPs, as observed in our cryo-ET data. The data were generated from two purification and crosslinking experiments, which were merged for data analysis (see Methods). b. Crosslinks mapped onto the ALYREF–EJC–RNA protomer structure. c. Crosslinks mapped onto the ALYREF–EJC–RNA dimer structure are compatible with both inter-EJC (dimer) and intra-EJC (protomer, panel b) crosslink distances. Crosslinks spanning less than 30 Å are shown. d. The ALYREF–MAGOH crosslinks mapped onto a model generated by superposing the ALYREF AlphaFold model onto the ALYREF-RRM. ALYREF residues in the AlphaFold model that are absent from the ALYREF–EJC–RNA structure are shown as transparent ribbons. e. Histograms and pie charts of Cα-Cα distances of crosslinked residues in the TREX structure. f. As panel e, but for the ALYREF–EJC–RNA structure. g. Protein-protein interaction network based on crosslinks of TREX–mRNPs after a one-step purification without nuclease digestion. Note that ribosomal proteins are common contaminants. The thickness of the grey lines connecting proteins scales with the number of unique crosslinked residue pairs.
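
Sorting crosslinks by Cα-Cα distance, as done for the coloring in panels a to c and the histograms in panels e and f, amounts to a Euclidean distance check against the 30 Å cutoff. The Python sketch below shows this under stated assumptions: residue identifiers and coordinates are hypothetical placeholders, and how the study treated a distance of exactly 30 Å is not stated, so the sketch counts it as within the cutoff.

```python
import math


def split_crosslinks(crosslinks, ca_coords, cutoff=30.0):
    """Split crosslinks into those within and those beyond a Cα-Cα distance cutoff.

    crosslinks: iterable of (residue_i, residue_j) identifiers
    ca_coords:  dict mapping a residue identifier to its Cα (x, y, z) position in Å
    Returns two lists of (residue_i, residue_j, distance) tuples.
    """
    within, beyond = [], []
    for res_i, res_j in crosslinks:
        distance = math.dist(ca_coords[res_i], ca_coords[res_j])
        target = within if distance <= cutoff else beyond
        target.append((res_i, res_j, distance))
    return within, beyond


# Hypothetical coordinates, for illustration only
coords = {"A": (0.0, 0.0, 0.0), "B": (3.0, 4.0, 0.0), "C": (0.0, 0.0, 40.0)}
print(split_crosslinks([("A", "B"), ("A", "C")], coords))
```

Crosslinks that land in the second list are the ones the legend attributes to proximity between separate TREX complexes on the same mRNP.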

Extended Data Fig. 8 TREX–mRNP cryo-tomography analysis.

a. Tilt-series pre-processing, tomogram reconstruction, template matching and particle classification. Tilt series movie frames were pre-processed using Warp61,86 and aligned in IMOD74 and tomograms were reconstructed in Warp with a pixel size of 10 Å/px (see Methods for details). Template matching and subtomogram reconstruction were performed in Warp. Two independent rounds of template matching and particle classification were performed; for the first round (left hand side), template matching was performed against raw tomograms using a reference volume from our single particle analysis of the endogenous TREX complex (this study). 242,237 subtomograms were extracted and classified into four classes using RELION62,71, and the regularization parameter was set to T = 4 for all classification runs. The best class (12% of extracted subtomograms) was denoised and used to perform template matching with denoised tomograms as search targets (right branch). This yielded 59,275 subtomograms, and particle classification was performed as before. In the next step, the overlap of good particles from both branches was taken as a high-confidence set and these particles were used to generate a reference-free volume to exclude potential reference bias in the final reconstruction. The obtained volume was used to further classify the combined particles from both picking strategies using three subsequent rounds of 3D classification. In the last round, a combined 10,105 sub-tomograms in classes 2, 3, and 4 contained the TREX complex and less than 1% of particles (class 1) gave rise to a ‘junk’ class, showing that classification had converged. The insets show zoom-ins of two classes that reveal unambiguous, low-resolution density for the UAP56 RecA1 lobe (monomer B or monomer A, respectively). b. Subtomogram average (STA) map of endogenous TREX–mRNPs with TREX density in green and mRNP density in grey. 
Insets show zoom-ins on the THO complex scaffold subunits (THOC5, −6, and −7), revealing an excellent fit of the TREX structure to the STA map and density features consistent with the resolution estimate (13 Å), such as the “hole” in the THOC6 WD40 density. c. Example of a reconstructed tomogram before denoising. d. The same tomogram as shown in panel c after denoising. e. The same tomogram as in panel d, but with TREX positions (green densities) obtained from STA overlaid. f. Gold-standard Fourier shell correlation (FSC = 0.143) curve for the STA reconstruction with three different masks: (1) a wide mask encompassing the entire C2-symmetric TREX complex (dotted line, 17 Å), (2) a tight mask encompassing the “scaffold” made from THOC5, −6, and −7 (15 Å), or (3) a tight mask around monomer B (THOC 1/2/3) (13 Å). g. Size comparison between a representative TREX–mRNP and the dilated human nuclear pore complex (PDB 7R5J). Visually identified TREX density in the TREX–mRNP particle is colored green, and mRNP density is colored grey.

Extended Data Fig. 9 Analysis of TREX-pairs on mRNPs.

a. Real-space representation of aligned TREX pairs (n = 275) shown from two views. The reference TREX (TREX-A) is shown as a ribbon representation, and all TREX-Bs are shown as a sphere placed at the TREX-B center. Spheres are colored by TREX-A to -B distances. b. Projection of TREX-B coordinates onto a 2D plane, colored as in panel a. θ and φ describe the angular component of a vector connecting TREX-A with TREX-B. c. Heatmap of TREX–TREX positions (expressed as θ and φ). d. Violin plot of TREX-A–TREX-B distances, measured from center-to-center or between the two closest atoms. e. Violin plot of rotation angles around the X, Y and Z axes that would align TREX-A with TREX-B. f. Violin plot of TREX–mRNP particle volumes measured for particles with more than two TREX complexes per mRNP in our stringently classified dataset or of random TREX–mRNP particles. No significant difference was found (Welch’s t-test, p = 0.0874). g. Violin plot of TREX–mRNP particle sphericity measured for particles with more than two TREX complexes per mRNP in our stringently classified dataset or of random TREX–mRNP particles. No significant difference was found (Welch’s t-test, p = 0.3162). h. Scatter plot of TREX–mRNP volume versus sphericity (n = 323). i. Analysis of TREX-A to -B contacts (defined as atoms of TREX-A within 10 Å to TREX-B) as observed for TREX pairs on endogenous mRNPs. TREX residues are colored by their proximity frequency, with atoms never in proximity to TREX-B in blue-green and atoms frequently in proximity in bright yellow. j. Analysis of THO–UAP56 contact sites (defined as atoms of THO–UAP56-A within 10 Å to THO–UAP56-B) as observed for the in vitro THO–UAP56 structure27. Atoms within 15 Å to the second copy are colored bright yellow.
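
The distance and the angles θ and φ used in panels a to d describe the vector from the TREX-A center to the TREX-B center in spherical coordinates. The Python sketch below shows one way to compute them. The angle convention is an assumption, since the legend does not specify it: here θ is the polar angle measured from the z axis and φ is the azimuth in the x–y plane, both in degrees, with coordinates already expressed in the reference frame of the aligned TREX-A. The function name and example coordinates are hypothetical.

```python
import math


def pair_geometry(center_a, center_b):
    """Return (distance, theta, phi) for the vector from center_a to center_b.

    Assumed convention: theta is the polar angle from the +z axis (0 to 180 degrees),
    phi is the azimuth from the +x axis in the x-y plane (-180 to 180 degrees).
    The two centers must not coincide.
    """
    dx, dy, dz = (b - a for a, b in zip(center_a, center_b))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    theta = math.degrees(math.acos(dz / distance))
    phi = math.degrees(math.atan2(dy, dx))
    return distance, theta, phi


# Hypothetical centers in Å, for illustration only
print(pair_geometry((0.0, 0.0, 0.0), (0.0, 150.0, 0.0)))
```

Plotting θ against φ for every pair gives a 2D projection of the kind shown in panels b and c, while the distance feeds the center-to-center distribution in panel d.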

Extended Data Fig. 10 Probing protein accessibility in endogenous mRNP complexes.

a. Schematic of the experiment to probe protein accessibility in mRNP complexes. The nuclear or cytoplasmic extract from K562 cells, tagged homozygously and endogenously with either GFP-3C-THOC5 or GFP-3C-EIF4A3, was incubated with a fluorescently labelled (fluorophore: AF647) 15 kDa anti-GFP nanobody. The extracts were then applied to a sucrose density gradient to separate free proteins from mRNPs, which migrate in heavy (later) sucrose gradient fractions. The gradient fractions were analyzed by SDS–PAGE. Due to high affinity of the anti-GFP nanobody to GFP (~1 pM), the nanobody stays bound to the GFP fusion during gel electrophoresis (see Methods for details). Fluorescence imaging allows quantification of the respective sedimentation profiles for the GFP fusion proteins (GFP-THOC5 or GFP-EIF4A3, green channel) and the anti-GFP nanobody-bound fusion proteins (red channel, colored in magenta). When the GFP-tagged protein is accessible in mRNPs, then the anti-GFP nanobody signal closely follows the profile of the GFP-tagged protein. In contrast, when a GFP-tagged protein is inaccessible in mRNPs, the anti-GFP nanobody signal follows the GFP signal in early (light) sucrose gradient fractions that contain free proteins but shows reduced intensity in later (heavy) fractions. b. The anti-GFP nanobody signal closely follows the GFP-THOC5 signal, showing that GFP-THOC5 is accessible in mRNP complexes. Shown is the fluorescence signal from SDS-PAGE gels of GFP-THOC5 nuclear extract incubated with the AF647-labeled anti-GFP nanobody (top) and normalized sedimentation profiles (bottom). Sedimentation plots show mean normalized intensity values determined from three gels (solid lines) and standard deviations (transparent areas). The grey box indicates the peak gradient fractions of purified TREX–mRNPs (see Extended Data Fig. 4). This experiment was done four times. For gel source data, see Supplementary Fig. 9. c. As for panel b, but for GFP-EIF4A3 in nuclear extract. 
In the high molecular weight fractions of the sucrose density gradient, GFP-3C-EIF4A3 is poorly accessible to the anti-GFP nanobody. This experiment was done four times. For gel source data, see Supplementary Fig. 10. d. As for panel b, but for GFP-EIF4A3 in cytoplasmic extract. In the high molecular weight fractions of the sucrose density gradient, GFP-EIF4A3 remains accessible to the anti-GFP nanobody, in contrast to GFP-EIF4A3 in nuclear extract, which is shown in panel c. This experiment was done twice. For gel source data, see Supplementary Fig. 11. e. Western blot experiment that shows the different depletion efficiencies of THOC1-GFP (ectopically overexpressed; Lenti O/E), GFP-THOC5 (endogenously tagged; endo), or GFP-EIF4A3 (endogenously tagged; endo) from nuclear extract using GFP-Trap resin (containing an anti-GFP nanobody coupled to 90 µm agarose beads) after three rounds of depletion. While THOC1-GFP and GFP-THOC5 are completely depleted in the supernatant, GFP-EIF4A3 is very inefficiently depleted. Anti-PSMA7 blots (a proteasome subunit) serve as loading controls. These experiments were done three times. For gel source data, see Supplementary Fig. 12. f. Cartoon model showing the position and nanobody-accessibility of GFP-tagged THOC5 or EIF4A3 in TREX–mRNPs, based on the accessibility to the anti-GFP nanobody and anti-GFP resin in panels b, c, and e.

Extended Data Table 1 Cryo-EM data collection and refinement statistics

Supplementary information

Supplementary Information

Supplementary Notes 1 and 2 contain details of the structure of the ALYREF(55–182)–EJC–RNA complex and TREX assignments in cryo-electron tomography data. Supplementary Figs. 1–12 show uncropped images of gels and western blots.

Reporting Summary

Supplementary Table 1

MS of TREX–mRNPs. Table of proteins identified by MS after a one-step purification of TREX–mRNPs through a THOC1–GFP pull down (sheet 1), after a THOC1–GFP pull down followed by an additional sucrose gradient ultracentrifugation step (sheet 2), or after a mild nuclease treatment of the NE followed by a THOC1–GFP pull down and a sucrose gradient ultracentrifugation step (sheet 3).

Supplementary Table 2

RNAs detected in TREX–mRNPs by QuantSeq. Percentage ReadCounts per gene types present in QuantSeq 3′ mRNA sequencing libraries prepared using RNA extracted from TREX–mRNPs. Gene-type analysis was restricted to genes with >1 ReadCount in all three replicates.
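
The gene-level filter described for this table (more than one read count in all three replicates) followed by a per-gene-type percentage can be sketched in Python as below. This is an illustration under stated assumptions: the data layout, the function name and the choice to pool read counts across replicates before computing percentages are all hypothetical, and the gene names and counts are placeholders rather than data from the study.

```python
from collections import defaultdict


def gene_type_percentages(counts, gene_types, min_count=1):
    """Percentage of read counts per gene type after replicate filtering.

    counts:     dict mapping gene -> list of read counts, one per replicate
    gene_types: dict mapping gene -> gene type label
    A gene is kept only if every replicate has more than `min_count` reads.
    Pooling counts across replicates is an assumption of this sketch.
    """
    totals = defaultdict(int)
    for gene, replicate_counts in counts.items():
        if all(c > min_count for c in replicate_counts):
            totals[gene_types[gene]] += sum(replicate_counts)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {gtype: 100.0 * n / grand_total for gtype, n in totals.items()}


# Placeholder counts, for illustration only
example_counts = {"geneA": [10, 20, 30], "geneB": [5, 0, 5], "geneC": [20, 10, 10]}
example_types = {"geneA": "protein_coding", "geneB": "lncRNA", "geneC": "lncRNA"}
print(gene_type_percentages(example_counts, example_types))
```

In the example, `geneB` is dropped because one replicate has no reads, so only `geneA` and `geneC` contribute to the percentages.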

Supplementary Table 3

Crosslinking MS of TREX–mRNPs. Table of TREX–mRNP protein–protein crosslink MS results obtained using sulfo-SDA. An FDR threshold of 2% on the protein–protein interaction-level was applied.

Supplementary Video 1

Structure of an ALYREF(55–182)–EJC–RNA complex. Colours as in Fig. 1.

Supplementary Video 2

Assembly of the ALYREF(55–182)–EJC–RNA hexamer. The video describes how the EJC, comprising EIF4A3, MAGOH and Y14, assembles on RNA, and how ALYREF assembles an ALYREF–EJC–RNA hexamer using the EJC–EJC interface a, and the ALYREF–EJC interfaces b, c, and d. Colours as in Fig. 1.

Supplementary Video 3

Model for ALYREF-mediated EJC–RNA multimerization. ALYREF may organize the mRNA held between two neighbouring EJCs bound to the same mRNA.

Supplementary Video 4

Structure of the endogenous TREX–mRNA complex. Colours as in Fig. 2.

Supplementary Video 5

Model of a multivalent TREX–mRNP. Colours as in Fig. 3.

Supplementary Video 6

Example of an endogenous TREX–mRNP containing three TREX complexes. Three high-confidence TREX complexes coat the central mRNP, which is shown as an idealized sphere. Colours as in Fig. 4e.

Peer Review File

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pacheco-Fiallos, B., Vorländer, M.K., Riabov-Bassat, D. et al. mRNA recognition and packaging by the human transcription–export complex. Nature 616, 828–835 (2023). https://doi.org/10.1038/s41586-023-05904-0

  • DOI: https://doi.org/10.1038/s41586-023-05904-0
