Abstract
JTE-607 is an anticancer and anti-inflammatory compound and its active form, compound 2, directly binds to and inhibits CPSF73, the endonuclease for the cleavage step in pre-messenger RNA (pre-mRNA) 3′ processing. Surprisingly, compound 2-mediated inhibition of pre-mRNA cleavage is sequence specific and the drug sensitivity is predominantly determined by sequences flanking the cleavage site (CS). Using massively parallel in vitro assays, we identified key sequence features that determine drug sensitivity. We trained a machine learning model that can predict poly(A) site (PAS) relative sensitivity to compound 2 and provide the molecular basis for understanding the impact of JTE-607 on PAS selection and transcription termination genome wide. We propose that CPSF73 and associated factors bind to the CS region in a sequence-dependent manner and the interaction affinity determines compound 2 sensitivity. These results have not only elucidated the mechanism of action of JTE-607, but also unveiled an evolutionarily conserved sequence specificity of the mRNA 3′ processing machinery.
Data availability
All deep sequencing data from the present study have been deposited in the Gene Expression Omnibus under series accession no. GSE218977. Source data are provided with this paper.
Code availability
The code for the machine learning analysis in the present study has been deposited on GitHub at https://github.com/angelamyu/C3PO.
Acknowledgements
We thank R. Beckwith for providing compound 2 and B. Semler and J. H. C. Nguyen for providing the suspension HeLa cell line. We wish to acknowledge the support of the Chao Family Comprehensive Cancer Center Shared Resource Genomics High-Throughput Facility, supported by the National Cancer Institute of the National Institutes of Health (NIH) under award no. P30CA062203. The present study was supported by the following grants: NIH (grant nos. R01AI166703, R01GM090056 and R35GM149294 to Y.S. and R21HG010945 to G.S.). L.L. is supported by the Center for Virus Research Graduate Fellowship provided by the University of California, Irvine Graduate Division. A.M.Y. is a Washington Research Foundation Postdoctoral Fellow. R.C.S. is a Pew Biomedical Scholar.
Author information
Contributions
L.L., A.M.Y., G.S. and Y.S. conceived and designed the project. L.L. and X.W. performed all the biochemical and sequencing experiments with help from L.V.S., Y.Y., K.S.K.S. and M.C.V. A.M.Y. performed all the machine learning experiments and data analysis. X.T., Y.C., J.L., W.E., R.S., Z.Y., I.M., F.Q., W.L. and Y.S. performed all the other bioinformatic analyses. L.L., A.M.Y., G.S. and Y.S. wrote the paper with input from all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Structural & Molecular Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Sara Osman, in collaboration with the Nature Structural & Molecular Biology team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Compound 2 does not affect 3′ processing complex assembly on resistant RNA.
Extended Data Fig. 2 In vitro cleavage for L3 and SVL chimeras.
Extended Data Fig. 3 Development and validation of MPIVA.
(a, b) In vitro cleavage on (a) L3 (UA to CA mutant) and (b) SVL (CA to UA mutant) and their IC50 values. The Compound 2 concentrations used are the same as in Figs. 1 and 2. (c) A density plot of the resistance scores of all variants in the SVL-N23 library. The low, medium, and high groups represent the screens in the presence of 0.5, 2.5, and 12.5 μM Compound 2. (d) A scatter plot comparing the cleavage efficiency (log(frequency in Library 2/frequency in Library 1)) and the resistance score (log(frequency in Library 5/frequency in Library 2)) of SVL-CS variants. Pearson correlation is shown. (e, f) In vitro cleavage validation experiments on four more RNAs (two sensitive and two resistant) from each of the (e) L3-N23 and (f) SVL-N23 libraries. The CS region sequences and their IC50 values are shown. (a, b) n = 2 biological replicates and both measurements are shown as dots.
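As a rough illustration of the log-ratio scores described in panel d, the sketch below computes a log frequency ratio between two libraries from raw variant counts. The function name, the optional pseudocount, the base-2 logarithm and the toy counts are assumptions made for illustration; the caption specifies only the ratios themselves, not how they were implemented.

```python
import math

def log_ratio_scores(num_counts, den_counts, pseudocount=1.0):
    """Per-variant log2(frequency in numerator library / frequency in
    denominator library). Counts are dicts mapping variant -> read count.
    The pseudocount and log base are illustrative assumptions."""
    variants = sorted(set(num_counts) | set(den_counts))
    num_total = sum(num_counts.get(v, 0) + pseudocount for v in variants)
    den_total = sum(den_counts.get(v, 0) + pseudocount for v in variants)
    scores = {}
    for v in variants:
        f_num = (num_counts.get(v, 0) + pseudocount) / num_total
        f_den = (den_counts.get(v, 0) + pseudocount) / den_total
        scores[v] = math.log2(f_num / f_den)
    return scores

# Hypothetical read counts for two variants in two libraries.
library_2 = {"A": 10, "B": 10}
library_5 = {"A": 30, "B": 10}

# Resistance-score analogue: log(frequency in Library 5 / frequency in Library 2).
resistance = log_ratio_scores(library_5, library_2, pseudocount=0.0)
```

In this toy case variant A is enriched in Library 5 relative to Library 2 (positive score) and variant B is depleted (negative score). The same function applied to Library 2 over Library 1 would give the cleavage-efficiency analogue.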
Extended Data Fig. 4 6-mer motif analyses and C3PO learned sequence features from MPIVA.
Counts of 6-mers from (a) L3 and (b) SVL backbones are plotted alongside the nucleotide content of significantly enriched 6-mers in the top 10,000 sensitive (left logo) and resistant (bottom logo) CS variants. Sequence logos use DNA encoding of RNA nucleotides. The top 10,000 resistant and sensitive sequences were converted into their 6-mer counts. 6-mers in these sequences were tested for enrichment with a binomial test with a null hypothesis of probability of success = 0.25^6 and an alternative hypothesis of > 0.25^6. The p-value threshold was adjusted by the number of possible 6-mers, 4^6, and thus significant 6-mers must have p-values ≤ 0.05/4^6. The nucleotide content of significant resistant and sensitive 6-mers is shown next to their respective axes. (c) C3PO’s layer 1 filters’ max activation sequence consensus and correlations with 12.5 μM Compound 2 sensitivity predictions. Related to Fig. 4d. Convolutional layers 1 and 2 were analyzed similarly to a previously published analysis of a CNN that predicts alternative polyadenylation (APARENT). In brief, every filter in both convolutional layers was correlated with predictions of drug sensitivity at the 12.5 μM dose. The top 5,000 input sequences from the training set that achieved maximal filter activation were put into a position weight matrix and used to generate position-aware consensus sequence logos. Pearson’s r plots of each filter’s activations with predicted 12.5 μM Compound 2 sensitivity at each position are plotted below these filter-specific sequence logos. Layer 1 filters are 8 positions wide, and layer 2 filters are 15 positions wide. Note that the convolutional layers in C3PO contain even zero-padding to maintain an input/output size of 25. The padding should be accounted for when analyzing the filters’ Pearson r plots. For example, in layer 1, the RNA sequences are padded with four zeros on both the left and right, and the first position in the correlation plots corresponds to three zeros and 5 nt of the randomized region. (d) C3PO’s layer 2 filters’ max activation sequence consensus and correlations with 12.5 μM Compound 2 sensitivity predictions. Related to Fig. 4e.
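The 6-mer enrichment test in panels a and b can be sketched as follows. This is a minimal standard-library version under stated assumptions: every sliding window of every sequence is treated as one Bernoulli trial, and the function names and example sequences are hypothetical. Only the null probability (0.25^6), the one-sided alternative and the Bonferroni threshold (0.05/4^6) are taken from the caption.

```python
import math
from collections import Counter

K = 6
NULL_P = 0.25 ** K            # chance of one specific 6-mer under uniform nucleotides
THRESHOLD = 0.05 / 4 ** K     # Bonferroni correction over all possible 6-mers

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * log_p + (n - i) * log_q
        for i in range(k, n + 1)
    ]
    peak = max(terms)
    return min(1.0, math.exp(peak) * sum(math.exp(t - peak) for t in terms))

def enriched_kmers(sequences, k=K):
    """Return {k-mer: p-value} for k-mers passing the corrected threshold.
    Counting every window as a trial is an assumption of this sketch."""
    counts = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    n_windows = sum(counts.values())
    p_values = {
        kmer: binom_upper_tail(c, n_windows, NULL_P) for kmer, c in counts.items()
    }
    return {kmer: p for kmer, p in p_values.items() if p <= THRESHOLD}

# Hypothetical input: the same 8-nt sequence repeated, so its 6-mers dominate.
repeated = enriched_kmers(["AAAAAAAC"] * 100)
single = enriched_kmers(["ACGTACGT"])
```

Summing the tail in log space avoids overflow in the binomial coefficients when the number of windows is large; extremely small p-values simply underflow to 0.0, which still falls below the threshold.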
Extended Data Fig. 5 ΔG of minimum free energy structures and base pairing probabilities of the top 10,000 resistant and sensitive sequences.
Comparison of the ΔG values of minimum free energy (MFE) structures from the top 10,000 resistant and sensitive (a) L3 and (b) SVL sequences. For all boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, and whiskers extending at most 1.5 × the interquartile range. The differences in ΔG are significant, with p-values of < 2.2 × 10^−308 for L3 and 1.58 × 10^−28 for SVL (two-sided t-test with unequal variance). (c) Heatmap of the difference between the average base pairing probabilities of the top 10,000 resistant and sensitive L3 sequences. (d) Same as in panel c but for SVL.
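The t-test with unequal variance named in this caption is Welch's t-test. The sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom using only the standard library; the ΔG values are made-up numbers for illustration, and converting the statistic to a p-value would additionally require a t-distribution CDF from a statistics package.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof

# Hypothetical values standing in for two groups of measurements.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_a, group_b)
```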
Extended Data Fig. 6 Performance and interpretation of machine learning models trained exclusively on one MPIVA RNA sequence context.
(a) Scatter plots of L3-only model performance on predicting drug sensitivity at three Compound 2 doses on L3 test sequences (upper) and SVL test sequences (lower). Test sequences include an equal number of sequences derived from each of the L3 and SVL RNA contexts. (b) Plot of the average, across all of the L3-only model’s layer 1 filters, of the absolute value of the Pearson correlation with 12.5 μM Compound 2 predictions at each position. These are split into Pearson correlation values associated with resistant, negative, and all 12.5 μM Compound 2 predictions. Dashed gray lines indicate positions at the edge of sequence padding. (c) L3-only model’s convolutional layer 1 max filter activations with the highest Pearson correlation with 12.5 μM Compound 2 predictions. Sequence logos are plotted on top of the per-position absolute value of Pearson correlations with 12.5 μM Compound 2 sensitivity predictions. Filters’ Pearson correlations that begin at the canonical cut site in the SVL context are marked; note that preceding filters may overlap with the designed canonical cut sites. (d) Same analyses as in panel a, but for the SVL-only model. (e) Same analyses as in panel b, but for the SVL-only model. (f) Same analyses as in panel c, but for the SVL-only model.
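The per-position correlation between one filter's activations and the model's predictions can be sketched as below. This is an illustrative standard-library version, not the authors' code: the data layout (one row of activations per sequence, one column per position), the function names and the toy numbers are all assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def per_position_correlation(activations, predictions):
    """activations[s][p]: activation of one filter for sequence s at position p.
    Returns one Pearson r per position against the per-sequence predictions."""
    n_positions = len(activations[0])
    return [
        pearson_r([row[p] for row in activations], predictions)
        for p in range(n_positions)
    ]

# Hypothetical activations for three sequences at three positions.
toy_activations = [
    [1.0, 3.0, 0.0],
    [2.0, 2.0, 0.0],
    [3.0, 1.0, 0.0],
]
toy_predictions = [0.1, 0.2, 0.3]
r_by_position = per_position_correlation(toy_activations, toy_predictions)
```

Positions that fall entirely within zero-padding would give constant activations, which is why the sketch returns NaN rather than a correlation there. Averaging the absolute values across filters would then give a summary of the kind plotted in panels b and e.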
Extended Data Fig. 7 Cleavage position in MPIVA and effects of YA identity.
(a) Fraction of cleavage position usage across all four MPIVA datasets. Position 0 demarcates the expected cleavage position. (b) Same as plotted in panel a, but only showing positions +8 and greater. Comparison of the effect of Y identity on (c) 12.5 μM Compound 2 resistance in the MPIVA datasets (p-values for L3 and SVL: 5.74 × 10^−63 and 1.42 × 10^−120), (d) in vitro cleaved log odds (p-values for L3 and SVL: 2.38 × 10^−46 and 5.32 × 10^−222), and (e) APARENT2-predicted cleaved log odds in the MPIVA datasets (p-values for L3 and SVL: both < 2.2 × 10^−308). Two-sided t-tests with unequal variance were used for all statistical tests. For all boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, and whiskers extending at most 1.5 × the interquartile range. Scatter plots of (f) 12.5 μM Compound 2 resistance, (g) in vitro cleaved log odds, and (h) APARENT2-predicted cleaved log odds of pairs of sequences that share the same sequences upstream of the YA dinucleotide in the MPIVA datasets.
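One way to form the sequence pairs plotted in panels f to h is to group variants by the sequence upstream of the YA dinucleotide and pair each UA variant with each CA variant in the same group. The sketch below assumes the YA position is known and fixed (`y_index`, a hypothetical parameter) and uses invented sequences and scores; the caption does not state how the pairing was implemented.

```python
from collections import defaultdict

def pair_by_upstream(scores, y_index):
    """Pair UA and CA variants that share the same sequence upstream of YA.
    scores maps RNA sequence -> score; y_index is the position of Y (assumed known).
    Returns a list of (upstream, UA score, CA score) tuples."""
    groups = defaultdict(lambda: {"U": [], "C": []})
    for seq, score in scores.items():
        if len(seq) < y_index + 2:
            continue  # too short to contain the dinucleotide
        y, next_nt = seq[y_index], seq[y_index + 1]
        if y in ("U", "C") and next_nt == "A":
            groups[seq[:y_index]][y].append(score)
    pairs = []
    for upstream in sorted(groups):
        for ua_score in groups[upstream]["U"]:
            for ca_score in groups[upstream]["C"]:
                pairs.append((upstream, ua_score, ca_score))
    return pairs

# Hypothetical scores; only the first two sequences form a UA/CA pair.
toy_scores = {
    "GGUAAA": 1.0,
    "GGCAAA": 2.0,
    "GAUAAA": 0.5,   # no CA partner with upstream "GA"
    "GGGAAA": 9.0,   # no YA dinucleotide at y_index
}
ya_pairs = pair_by_upstream(toy_scores, y_index=2)
```

Each resulting tuple corresponds to one point in a scatter plot with the UA score on one axis and the CA score on the other.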
Extended Data Fig. 8 Comparison of top 1,000 sensitive and resistant PASs.
The 1,000 genes with the most significant (sensitive) and least significant (resistant) readthrough after JTE-607 treatment were identified based on our 4sU-seq data, and the resistance scores of their PASs were predicted by C3PO and compared. Two-sided t-test: p = 0.0063.
Extended Data Fig. 9 Comparison of JTE-607 treatment with Ube3d knockdown.
(a) Poly(A) site usage index (PDUI) in control and Ube3d knockdown cells. Blue dots: genes with significant 3′ UTR lengthening; red dots: genes with significant 3′ UTR shortening. (b) A Venn diagram comparing the genes with significant APA changes in Ube3d knockdown and JTE-607-treated cells.
Extended Data Fig. 10 Conservation and gene ontology analyses for JTE-607-sensitive and -resistant PASs.
(a) The phyloP sequence conservation score for both resistant and sensitive PASs across different species was calculated and plotted against nucleotide position of the CS. Position 0 is the YA (Y is U or C) cleavage position. (b) Gene ontology analyses of genes that contain the top 1,000 sensitive or resistant PASs in HepG2 cells. This analysis was done with gProfiler and the top six categories that contain between 5 and 500 genes are listed.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, L., Yu, A.M., Wang, X. et al. The anticancer compound JTE-607 reveals hidden sequence specificity of the mRNA 3′ processing machinery. Nat Struct Mol Biol 30, 1947–1957 (2023). https://doi.org/10.1038/s41594-023-01161-x
DOI: https://doi.org/10.1038/s41594-023-01161-x