SULT1A1-dependent sulfonation of alkylators is a lineage-dependent vulnerability of liver cancers

Adult liver malignancies, including intrahepatic cholangiocarcinoma and hepatocellular carcinoma, are the second leading cause of cancer-related deaths worldwide. Most individuals are treated with either combination chemotherapy or immunotherapy, respectively, without specific biomarkers for selection. Here, using high-throughput screens, proteomics and in vitro resistance models, we identify the small molecule YC-1 as selectively active against a defined subset of cell lines derived from both liver cancer types. We demonstrate that selectivity is determined by expression of the liver-resident cytosolic sulfotransferase enzyme SULT1A1, which sulfonates YC-1. Sulfonation stimulates covalent binding of YC-1 to lysine residues in protein targets, enriching for RNA-binding factors. Computational analysis defined a wider group of structurally related SULT1A1-activated small molecules with distinct target profiles, which together constitute an untapped small-molecule class. These studies provide a foundation for preclinical development of these agents and point to the broader potential of exploiting SULT1A1 activity for selective targeting strategies.
Shi et al. report a sulfonation-dependent vulnerability of liver tumors expressing the sulfotransferase SULT1A1 by showing their sensitivity to the small molecule YC-1 and identifying structurally related compounds that can be modified by SULT1A1.

Article https://doi.org/10.1038/s43018-023-00523-0
Liver cancer is one of the greatest challenges in oncology, with an annual worldwide burden of >800,000 new diagnoses and >700,000 deaths and an incidence rate that has been rising for several decades 1,2. The main types of primary adult liver malignancy are intrahepatic cholangiocarcinomas (ICCs) and hepatocellular carcinomas (HCCs), classified by morphological and molecular similarity to bile duct cells and hepatocytes, respectively. The standard treatments in the advanced setting are combination chemotherapy for ICC 3 and combined immunotherapy/multikinase inhibition for HCC 4. While response rates and overall survival have improved, outcomes remain poor, and no molecular stratification is used to guide first-line treatment decisions.
The identification of genomic alterations across different subsets of individuals with liver cancer has led to the recent exploration of precision medicine strategies. In ICC, targeted therapies against isocitrate dehydrogenase 1 (IDH1) mutations, fibroblast growth factor receptor 2 (FGFR2) fusions and BRAF mutations show benefit 3. However, response rates remain relatively low, and disease progression inevitably occurs. Moreover, more than half of individuals with ICC lack presently actionable mutations. Likewise, a subset of individuals with HCC have genomic alterations suggesting response to targeted therapies, although it is not clear whether these approaches represent improvements over standard of care 4. Thus, complementary exploration of combination or alternative treatment modalities is warranted.
While ICC and HCC have different genetic and clinicopathological features, there may be opportunities to harness overlaps in biology relating to liver cell lineage states. In particular, the presence of mixed histological subtypes and the expression of common lineage markers suggest that liver tumors may comprise a continuous spectrum between hepatocyte-like and bile duct-like phenotypes 5-9, observations consistent with the capacity of hepatocytes and bile duct cells to transdifferentiate via bipotential intermediates 10,11.
In this Article, by conducting high-throughput pharmacologic screens, functional studies and proteomics analyses, we defined synthetic lethal interactions with the small molecule 3-(5′-hydroxymethyl-2′-furyl)-1-benzylindazole (YC-1; lificiguat) in specific liver cancer subsets. We showed that YC-1 is metabolically activated by the hepatocyte lineage cytosolic phenol sulfotransferase SULT1A1, which is highly expressed in a substantial subset of HCCs and in ICCs with dual hepatocyte/bile duct features. SULT1A1 converts YC-1 to a strong alkylator with a target profile enriched for RNA-binding proteins. Subsequent pharmacogenomic analysis, secondary screens and molecular modeling revealed a broader class of SULT1A1-dependent anticancer compounds with a common chemical motif. These studies suggest opportunities to harness this class of activatable alkylators against SULT1A1+ liver cancers of different genotypes.

YC-1 is selectively active against liver cancer cells
We initially sought to identify synthetic lethal therapeutic interactions in IDH1-mutant (IDHm) ICC by conducting a screen on two IDHm (RBE and SNU1079) and two isocitrate dehydrogenase wild type (IDH WT; CCLP1 and HuCCT1) ICC cell lines against the National Center for Advancing Translational Sciences (NCATS) Mechanism Interrogation Plate (MIPE) 12, consisting of 1,912 oncology-focused compounds, including those with a predicted mode of action and those without established targets (Fig. 1a and Supplementary Table 1). We also screened three of these lines against the NCATS Pharmacologically Active Chemical Toolbox (NPACT) 13 and kinase-targeting libraries, totaling an additional 6,076 annotated clinical and preclinical compounds. Comparing ranked differential sensitivity scores (area under the curve (AUC)) between IDHm and IDH WT groups, we identified 36 compounds (1.9% of the MIPE library) that were selectively active against IDHm ICC cells (Fig. 1b; Z < -1.65, P < 0.05), of which 14 (39%) had well-defined mechanisms of action. The most significant outliers against IDHm ICC were SRC family kinase (SFK) inhibitors and YC-1 (Fig. 1b). We previously reported characterization of the sensitivity of IDHm ICC to SFK inhibitors based on a prior screen 14 and hence focused on YC-1 for further analysis. Scaled-up experiments demonstrated that YC-1 selectively induced apoptosis, marked by activation of p53 and caspase 3/caspase 7, which was preceded by cell cycle arrest at the G1/S phase transition (Extended Data Fig. 1a-g).
YC-1 lacks an established mechanism of action mediating its cytotoxicity, although it has been reported to function as an inhibitor of hypoxia-inducible factor 1-α (HIF1α) 15 and, at high concentrations (>50 μM), an agonist of soluble guanylyl cyclase (sGC) 16. However, we failed to observe specific activity against IDHm ICC cells by multiple selective HIF1α inhibitors or sGC agonists tested in the screen or in subsequent studies using a larger cell line panel (Extended Data Fig. 2a). Moreover, CRISPR screens indicated that HIF1A and HIF2A are dispensable for the growth of IDHm ICC cells in vitro (Extended Data Fig. 2b). Therefore, we concluded that YC-1 decreases cell viability through a distinct mechanism.
We defined the profile of YC-1 activity across an expanded set of biliary cell lines representing diverse genomic features and biology (Fig. 1a, middle, and Supplementary Table 2; n = 26 cell lines, including ICC, extrahepatic cholangiocarcinoma (ECC), gallbladder carcinoma and the immortalized bile duct line MMNK1). Calculation of the half-maximal growth inhibitory concentration for each cell line (IC50; Fig. 1c,d) revealed a >130,000-fold range of sensitivity (4.77 nM to >631 μM). Each IDHm ICC cell line tested (n = 5) ranked as highly sensitive. However, several WT IDH1 cell lines showed comparable responsiveness, prompting us to consider determinants for YC-1 sensitivity beyond IDH status.
To further define contexts for YC-1 sensitivity, we tested this compound against a panel of 1,022 cancer cell lines derived from >25 tumor types, which we have profiled extensively as part of our Genomics of Drug Sensitivity program 17,18 (Fig. 1a, right). In total, we identified 101 YC-1-responsive cell lines across cancer types (9.7%; Supplementary Table 3 and Methods). There were broad trends in response, with particular enrichment of sensitivity in primary liver cancers (ICC and HCC) and bone and pleural tumors, whereas prostate, stomach, skin and esophageal cancer cell lines (among others) were largely resistant (Fig. 1e). Multiple ICC cell lines, including those with IDH mutations and other genotypes (FGFR2 fusion and BAP1 inactivation), ranked among the most sensitive (Fig. 1f). Cell lines derived from other anatomical subtypes of biliary cancer (ECC and gallbladder carcinoma) were not highly responsive to YC-1 (Fig. 1e). Thus, YC-1 responsiveness varies widely among human cancer cell lines, with enriched sensitivity in both major liver malignancies.
[Figure residue, cell line axis labels: IDHm; ICC21, ICC137, ICC20, ICC7, ICC5, RBE, SNU1079, ICC19, YSCCC, ICC4, SG231, ICC6, ICC8, CCSW1, TKKK, CCLP1, KMCH1, MMNK1, GBC1, HuCCT1, ICC12, HKGZCC, ECC3, ICC2, SSP25, Huh28; axis: -log10.]

SULT1A1 expression confers YC-1 sensitivity
To study the basis for YC-1 selectivity, we developed acquired resistance models by subjecting RBE cells to gradually increasing concentrations of YC-1 (Fig. 2a and Methods). Six YC-1-resistant clones were isolated, and each was insensitive at concentrations greater than 25 μM compared to an IC50 of 0.426 μM for parental RBE cells (Fig. 2b). The resistant phenotype was stable after culturing without YC-1 and then rechallenging with drug. We used tandem mass tag (TMT) labeling-based quantitative mass spectrometry (MS) to identify proteome changes associated with acquired resistance to YC-1 (Fig. 2c, top). Compared to parental RBE cells, all six resistant lines showed striking changes in levels of a single protein among 9,895 proteins detected, specifically, depletion of the cytosolic sulfotransferase enzyme SULT1A1 (Fig. 2d and Supplementary Table 4). The related sulfotransferase, SULT1A4, exhibited a similar, although less pronounced, trend in depletion. Immunoblotting confirmed marked depletion of SULT1A1 in resistant clones (Fig. 2e).
To further explore the association between SULT1A1 levels and YC-1 response, we analyzed a panel of biliary cancer cell lines (n = 37) with multiplexed quantitative proteomics and calculated differential protein expression and significance between YC-1-sensitive and YC-1-insensitive groups (IC50 = 0.04-2.14 μM and IC50 > 3.00 μM, respectively; Fig. 2c,f and Supplementary Table 5). Reduced expression of SULT1A1 was again the top outlier across this heterogeneous set of cell lines. Immunoblotting showed nearly binary differences in expression of SULT1A1 in YC-1-sensitive versus YC-1-insensitive groups (Fig. 2g). Furthermore, a strong linear correlation was observed between YC-1 sensitivity (IC50) and SULT1A1 protein expression after log10 normalization (Fig. 2h).
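The relationship between IC50 and SULT1A1 protein expression after log10 normalization can be illustrated with an ordinary least-squares fit on log-transformed values. The following is a minimal sketch, not the analysis used in the study: the function name `loglog_fit` and the example values are hypothetical, and the study's own fitting procedure is not described in this section.

```python
import math

def loglog_fit(expression, ic50):
    """Least-squares fit of log10(IC50) against log10(expression).

    Returns (slope, intercept, pearson_r). Hypothetical helper for illustration.
    """
    x = [math.log10(v) for v in expression]
    y = [math.log10(v) for v in ic50]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical values (not data from the study):
# relative SULT1A1 abundance and YC-1 IC50 in micromolar.
expression = [1.0, 10.0, 100.0, 1000.0]
ic50_um = [100.0, 10.0, 1.0, 0.1]
slope, intercept, r = loglog_fit(expression, ic50_um)
```

A negative slope with r close to -1 corresponds to the pattern described above, in which higher SULT1A1 expression accompanies lower IC50.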
Examination of SULT1A1 mRNA expression indicated that the differences in protein expression were due to transcriptional regulation (Extended Data Fig. 3a). Human SULT1A1 is a cytosolic phenol sulfotransferase that participates in xenobiotic metabolism and hormonal regulation 19. We used CRISPR-Cas9-mediated knockout to test the functional role of SULT1A1 in the response to YC-1 (Fig. 3a). SULT1A1 knockout in SNU1079 cells with six distinct short guide RNAs (sgRNAs) caused marked resistance to YC-1 (>100-fold increase in IC50) relative to parental cells and cells transduced to express sgRNA against green fluorescent protein (GFP; Fig. 3b,c), whereas the response to the SRC/SFK inhibitor dasatinib 14 (Fig. 1b) was unaffected (Fig. 3c). Comparable results were observed in two other YC-1-hypersensitive cell lines, RBE and ICC20 (Extended Data Fig. 3b-e). Expression of CRISPR-resistant SULT1A1 (Extended Data Fig. 3f) restored responsiveness of SULT1A1-knockout cells to YC-1 treatment, confirming specificity (Fig. 3d,e). Conversely, exogenous viral expression of common polymorphic variants of SULT1A1 was sufficient to engender YC-1 sensitivity in all six SULT1A1 low (YC-1-insensitive) cholangiocarcinoma cell lines tested, reducing the IC50 by 1,000- to 10,000-fold, whereas proliferation of the mouse hepatocyte cell line AML12 was unaffected (Fig. 3f-h and Extended Data Fig. 3g-i). By contrast, overexpression of a distinct sulfotransferase, SULT1A3, resulted in only an approximately tenfold increased sensitivity to YC-1 (Fig. 3g,h and Extended Data Fig. 3h,i). Thus, we conclude that SULT1A1 expression determines sensitivity to YC-1.
Human SULT1A1 is selectively expressed in hepatocytes. In this regard, reexamination of the YC-1 response profiles indicated that YC-1 sensitivity and SULT1A1 levels in ICC cells correlate with a distinct protein expression signature. This signature consists of enrichment of hepatocyte markers with concurrent expression of bile duct markers, whereas SULT1A1− biliary cancer cell lines lack substantial expression of hepatocyte markers (Fig. 4a,b and Extended Data Fig. 4a). This 'bilineage' signature is associated with specific genomic features (IDH mutation, FGFR2 fusion and BAP1 loss; Fig. 4c). Notably, in human samples, these genomic alterations correlate with the small duct histological subtype of ICC, resembling the cholangioles (canals of Hering), channels at the junction of the hepatocytes and biliary tree and lined serially by cells of either lineage 20-26. ICCs lacking these mutations show similarity to the large, mature bile ducts (that is, large duct subtype). YC-1 responsiveness is depicted in relation to these genotypes and to hepatobiliary cancer subtype (HCC, ICC, ECC, gallbladder carcinoma or mixed ICC/HCC) in Fig. 4c-e. Analysis of 23 human-derived xenograft models also showed associations with SULT1A1 protein expression and IDH mutations, FGFR2 fusions and BAP1 mutation (Extended Data Fig. 4b).
Thus, SULT1A1 expression defines YC-1-sensitive cells and is enriched in ICC cells exhibiting a bilineage expression signature (Fig. 4f) and in HCC.

Furfuryl alcohol moiety determines YC-1 toxicity
SULT1A1 uses the cosubstrate 3′-phosphoadenosine-5′-phosphosulfate (PAPS) to transfer a high-energy sulfate to the hydroxy moiety of phenol groups within target molecules (metabolites, xenobiotics and hormones). Sulfonation increases aqueous solubility of xenobiotics and alters the binding properties of hormones (Extended Data Fig. 5a). YC-1 comprises a furfuryl alcohol, an indazole core and a benzyl group (Fig. 5a). The furfuryl alcohol of YC-1 structurally mimics phenol, suggesting that this group may be a substrate for SULT1A1 phenol sulfotransferase activity and that YC-1 sulfonation may underlie its cytotoxicity. In this regard, crystal structures of SULT1A1 with known substrates reveal plasticity within the catalytic site, permitting a range in substrate specificity 27.
Fig. 3 (legend fragment) | b, Immunoblot for SULT1A1 in SNU1079 parental cells or CRISPR-engineered derivatives with control sgGFP or sgSULT1A1 knockout (CSK1-CSK6). c, SNU1079 parental cells or the engineered derivatives were tested for sensitivity to YC-1 (left) or dasatinib (right). SSP25 and CCLP1 are ICC cell lines that are insensitive to both drugs and are shown for reference. Two biologically independent experiments are shown. d, Immunoblot demonstrating restored expression of SULT1A1 using a CRISPR-resistant construct in RBE SULT1A1-knockout cells; EV, empty vector. e, Reexpression of CRISPR-resistant SULT1A1 resensitizes SULT1A1-knockout RBE cells to YC-1. Data show mean measurements from two biologically independent experiments. f, Schematic for ectopic overexpression of SULT1A1 in ICC cells. g, Immunoblot confirming overexpression of SULT1A1 (denoted by an asterisk (*)), corresponding to h. Several common germline variants of SULT1A1 were tested: SULT1A1-1 (V220M, V223M and F247L), SULT1A1-2 (S44N, V164A and V223M) and SULT1A1-3 (V223M). h, Ectopic expression of SULT1A1 sensitizes SSP25 cells to YC-1 but not dasatinib. Two biologically independent experiments are shown. Error bars represent mean ± s.d.; n = 4 biologically independent experiments. SULT1A3 only modestly increases sensitivity. Immunoblots (b, d and g) were from one of the two performed experiments with similar results.
We surveyed structure-activity relationships (SARs) by systematically modifying each structural component of YC-1 (Fig. 5a). A set of 120 analogs was synthesized and screened against two YC-1-sensitive and two YC-1-insensitive ICC cell lines with high and low SULT1A1 expression, respectively. This analysis indicated that the furan group and hydroxymethyl on the furfuryl alcohol were most important for YC-1 selectivity (differential AUC) and efficacy (average AUC relative to the SULT1A1 high group; Fig. 5a, right). Notably, loss of the hydroxy group within the furfuryl alcohol abolished YC-1 activity (273-fold increase in IC50; Fig. 5b and Extended Data Fig. 5b), consistent with the importance of sulfonation of this group. By contrast, several analogs containing modifications to the benzyl group exhibited increased selectivity toward SULT1A1 high cells (Fig. 5b and Supplementary Table 6). We computationally modeled the interaction of YC-1 with the crystal structures of human SULT1A1 (ref. 27; Fig. 5c and Methods). The cosubstrate PAPS (represented by the non-sulfated form PAP in the crystal structure) is coordinated at one side of the catalytic pocket. YC-1 fits opposingly on the other side in a branched conformation with its hydroxy pointing toward the high-energy sulfate from PAPS. Molecular interactions specifically coordinating SULT1A1 and YC-1 include a cation-π interaction from His 108 to the furan, π-π stacking from Phe 84 to the benzyl and a hydrogen bond between Lys 106 and the oxygen of furan (Extended Data Fig. 5c). Thus, structural modeling supports YC-1 as a SULT1A1 substrate.
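The SAR readouts mentioned above, efficacy as an average AUC and selectivity as a differential AUC between SULT1A1 high and SULT1A1 low lines, can be sketched as follows. This is an illustrative calculation only: the exact AUC definition used in the screen is not given in this section, so the normalized trapezoidal area over log10 concentration is an assumption, and the function names and viability values are hypothetical.

```python
import math

def dose_response_auc(concs_um, viability):
    """Trapezoidal area under a viability curve over log10(concentration).

    Normalized by the log-concentration span (an assumed convention), so a
    flat curve at viability 1.0 returns 1.0 and a fully active compound
    approaches 0.
    """
    xs = [math.log10(c) for c in concs_um]
    area = 0.0
    for i in range(1, len(xs)):
        area += 0.5 * (viability[i] + viability[i - 1]) * (xs[i] - xs[i - 1])
    return area / (xs[-1] - xs[0])

def differential_auc(auc_high, auc_low):
    """Mean AUC in SULT1A1 high lines minus mean AUC in SULT1A1 low lines.

    More negative values indicate greater selectivity for SULT1A1 high cells.
    """
    return sum(auc_high) / len(auc_high) - sum(auc_low) / len(auc_low)

# Hypothetical dose-response data (fraction of viable cells), not from the study.
concs = [0.01, 0.1, 1.0, 10.0]
sensitive = dose_response_auc(concs, [1.0, 0.5, 0.0, 0.0])
insensitive = dose_response_auc(concs, [1.0, 1.0, 1.0, 1.0])
delta = differential_auc([sensitive], [insensitive])
```

Under this convention an analog that loses activity, such as the dehydroxylated form, would move toward a differential AUC of zero.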
Accordingly, we tested whether SULT1A1 enzymatic activity was required for YC-1 efficacy by using the phenol-mimicking SULT1A1 inhibitor 2,6-dichloro-4-nitrophenol (DCNP-A) 19 and its analog 2,4-dichloro-6-nitrophenol (DCNP-B). YC-1-treated cells were completely rescued by increasing concentrations of DCNP-A, whereas DCNP-B produced a milder rescue (Fig. 5d). Importantly, an in vitro reconstituted enzymatic assay showed that recombinant SULT1A1 sulfonates YC-1 but not its dehydroxylated form (Fig. 5e). Collectively, these data demonstrate the requirement of SULT1A1 sulfotransferase activity for YC-1 efficacy and indicate that the furfuryl alcohol moiety is the direct target of sulfonation (Fig. 5c, bottom).
The highly specific mechanism of YC-1 activation prompted us to identify additional compounds potentially activated by SULT1A1 via computational analysis of pharmacogenomic databases (Methods). The NCI-60 database has annotated cytotoxicity of >22,000 compounds against 60 cancer cell lines. Using the CellMiner NCI-60 tool (https://discover.nci.nih.gov/cellminer/), we identified hundreds of compounds whose activity profiles showed high correlation with either SULT1A1 transcript levels or YC-1 sensitivity (designated NSC 728165 in the NCI-60 database). The top ~150 compounds were categorized into groups based on chemical structure (Fig. 5f, Extended Data Fig. 5d and Supplementary Table 7), including analogs of oncrasin-1 (N-benzyl indole carbinol (N-BIC) group), reactivating p53 and inducing tumor apoptosis (RITA) and aminoflavone (anticancer agents whose activity has been predicted or experimentally shown to depend on SULT1A1 (refs. 28-31)) as well as sets of molecules not previously linked to SULT1A1, namely Phortress analogs, and two additional groups of compounds. Query of the Broad Institute PRISM data platform 32 representing >4,000 small molecules tested against a panel of 578 cancer cell lines also revealed strong correlations between oncrasin-1, RITA and Phortress sensitivity profiles and both our YC-1 response data and SULT1A1 mRNA expression levels (Fig. 5g,h and Extended Data Fig. 5e). The molecules identified by the CellMiner NCI-60 analysis included 80 related compounds (amino halogenated benzyl alcohol (AHBA) series), of which 66 were highly similar to one another, sharing a core structure of 2-halogenated 4-amino benzyl alcohol, reminiscent of the furfuryl alcohol of YC-1 (Fig. 5f,i and Extended Data Fig. 5d).
Testing the AHBA series in our cell line panel (two SULT1A1+ and two SULT1A1− lines) confirmed selective activity toward SULT1A1+ cells, comparable to that of YC-1 (Extended Data Fig. 5f and Supplementary Table 8). The other group includes compounds containing hydrazone derivatives of benzyl alcohols (hydrazone group; Fig. 5f and Extended Data Fig. 5d). Hydrazones (composed of an aldehyde or ketone capped by hydrazine) are susceptible to acid hydrolysis to expose the aldehyde 33, which is likely the target of sulfonation. Thus, we demonstrate unexpected, critical roles for SULT1A1 in the activity of previously studied anticancer agents (YC-1 and Phortress), and we identify an additional compound series whose activity correlates with SULT1A1 expression.
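The database query described above amounts to ranking compounds by how well their activity profiles across cell lines track SULT1A1 transcript levels. A minimal sketch of that ranking follows; it is not the CellMiner implementation, and the function names, compound labels and values are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_by_correlation(reference, profiles):
    """Return (name, r) pairs sorted from most to least correlated."""
    scored = [(name, pearson(reference, values)) for name, values in profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical SULT1A1 mRNA levels across five cell lines (arbitrary units).
sult1a1_mrna = [1.0, 2.0, 3.0, 4.0, 5.0]
# Hypothetical activity scores for three made-up compounds in the same lines.
activity = {
    "compound_A": [0.9, 2.1, 2.9, 4.2, 5.0],
    "compound_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "compound_C": [3.0, 1.0, 4.0, 1.0, 5.0],
}
ranked = rank_by_correlation(sult1a1_mrna, activity)
top_names = [name for name, _ in ranked]
```

The same ranking could be run with a YC-1 sensitivity profile as the reference instead of transcript levels, which mirrors the two criteria named in the text.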
N-BIC and RITA have been proposed to be converted to electrophilic alkylators by in situ sulfonation of their hydroxymethyl groups 28,30. In addition, aminoflavone is thought to be hydroxylated by cytochrome P450 enzymes, enabling its subsequent sulfonation to become an electrophilic alkylator 31. Examination of each group of SULT1A1-activated agents suggested a common chemical structure of electron-rich benzyl alcohol derivatives. Following sulfonation, the ring structure is presumably converted into a stabilized, electrophilic intermediate that, in turn, acts as an alkylating reagent. Thus, our elucidation of the YC-1 mechanism of action, together with identification of these additional compound groups (N-BIC, RITA, AHBA and hydrazone), defines a new antitumor compound class activatable by SULT1A1 that harbors a core furfuryl or benzyl alcohol structure that is present natively (Fig. 5i) or after metabolic processing 31.

Sulfonated YC-1 alkylates proteins
SULT1A1 activity can generate alkylators, suggesting that the aforementioned compounds may bind covalently to cellular targets. To explore the mechanism of YC-1 cytotoxicity, we developed YC-1 derivatives based on the SAR data. In particular, we generated affinity tags and click chemistry reagents by conjugating biotin with a PEG linker (YC-1-biotin) or an alkyne/azide, respectively, to the benzyl group (Fig. 6a), which we found to be amenable to modification (Fig. 5a). These compounds maintained SULT1A1-dependent efficacy, with meta-substituted YC-1-biotin showing highest selectivity against SULT1A1-expressing cells.
As an inactive control, we also generated a dehydroxylated analog (DH-YC-1-biotin), which is incapable of being sulfonated and lacks efficacy (Fig. 6a).
N-BIC was previously found to covalently bind to proteins in the cytosol 29, RITA was found to cross-link DNA and proteins 34, and aminoflavone and Phortress were found to form DNA adducts 35,36. Accordingly, we sought to determine whether YC-1 covalently binds intracellular molecules in a SULT1A1-dependent manner. First, we explored potential YC-1-protein adduct formation by dot blot analysis of nucleic acid-free protein extracts from cells treated with YC-1-biotin or DH-YC-1-biotin. Probing blots with streptavidin revealed enriched binding to YC-1-biotin, which was greatly augmented after SULT1A1 overexpression (Extended Data Fig. 6a). Subsequent analysis of cells treated with YC-1 derivatives revealed a temporal increase in covalent binding of YC-1-biotin (Fig. 6b). Immunofluorescence using a streptavidin-FITC probe also showed progressive accumulation of YC-1-biotin in the cytosol and subsequent nuclear intensification, reinforcing the covalent nature of YC-1 binding to protein targets (Fig. 6c and Methods). Furthermore, YC-1-biotin binding was largely abolished by YC-1 parent competition or DCNP inhibition of SULT1A1 catalytic activity, indicative of protein binding specificity and its dependence on SULT1A1 (Fig. 6d). By contrast, we failed to observe evidence of YC-1-DNA adduct formation in studies in which we either extracted DNA from YC-1-biotin-treated cells and performed DNA dot blots (probed with streptavidin) or extracted DNA from YC-1 parent-treated cells and tested for hydrolyzed nucleic acids via liquid chromatography-mass spectrometry (Methods).
Fig. 5 | A furfuryl alcohol moiety determines YC-1 toxicity and defines a class of SULT1A1-activatable compounds. a,b, One hundred and twenty analogs of YC-1 were generated and screened for activity against two SULT1A1 high cell lines (RBE and SNU1079) and two SULT1A1 low cell lines (CCLP1 and SSP25). a, Schematic of the chemical moieties of YC-1 (left) and summary of SAR data for the YC-1 analogs grouped according to modifications in the indicated chemical groups. The y axis represents shifts in AUC of the specific YC-1 analogs versus parental YC-1 in SULT1A1 high cell lines. The x axis compares the activity of the analogs versus parental YC-1 in terms of differential sensitivity toward SULT1A1 high cell lines relative to SULT1A1 low lines. b, Graph showing the ranked activity of YC-1 analogs (or parent compound) in terms of differential sensitivity toward SULT1A1 high cells versus SULT1A1 low cells (y axis). The color code represents the relative sensitivity of SULT1A1 high cells to each analog. Bubble sizes denote significance (P value). c, Structural modeling analysis showing docking of YC-1 in the SULT1A1 crystal structure (PDB: 3U3M). A schematic of the predicted sulfonation of YC-1 by SULT1A1 is shown on the bottom. d, Treatment of RBE cells with YC-1 in the presence or absence of a potent (DCNP-A) or less potent (DCNP-B) SULT1A1 inhibitor. Two biologically independent experiments are shown. e, In vitro enzymatic assay showing that SULT1A1 modifies YC-1 but not its dehydroxylated analog. YC-1 or dehydroxylated YC-1 were incubated with recombinant SULT1A1 protein in the presence of p-nitrophenylsulfate and 3′-phosphoadenosine-5′-phosphosulfate (for an additional control, YC-1 was incubated in the reaction buffer without SULT1A1). The reaction was monitored by quantifying released p-nitrophenol via measuring UV absorbance at 405 nm. Data shown are mean measurements from one of the two performed experiments with similar results. f, Results of the computational analysis of the NCI-60 database using CellMiner showing compound groups whose activity profiles are highly correlated with that of YC-1 (y axis) and with SULT1A1 mRNA levels (x axis). Bubble size represents the number of compounds within a given group.

We next sought to identify the amino acid residue(s) in proteins that are conjugated by YC-1. The YC-1-biotin-bound proteome was isolated by streptavidin bead affinity purification after 1 day (d) of treatment and was then subjected to complete proteolytic digestion. MS revealed strong detection of YC-1-biotin conjugation to lysine residues, followed by serine and asparagine, compared to control DH-YC-1-biotin samples (Extended Data Fig. 6b and Methods). The side chain of each differentially conjugated amino acid residue contains a nucleophilic nitrogen (for example, amine in lysine) or oxygen (for example, hydroxy in serine) that can react and form a covalent bond with the electrophilic intermediate of YC-1 (Extended Data Fig. 6c). Thus, we conclude that sulfonated YC-1 binds cellular proteins, most prominently via covalent linkage with the side chain of lysine residues.
[Figure residue, SAR plot: x axis, ∆AUC (SULT1A1 high versus SULT1A1 low cells); legend, YC-1 parent, benzyl analogs (49), furan analogs (14), hydroxyl analogs (21), indazole analogs (9).]
We used a chemoproteomic approach to identify proteins covalently bound by YC-1. Cells were treated with YC-1-biotin or DH-YC-1-biotin for 8 h. Lysates were then subjected to streptavidin-based affinity purification in the presence of YC-1 parent compound or inactive YC-1 followed by TMT proteomics. Of 250 proteins detected by YC-1-biotin affinity pulldown, 51 were specifically bound compared to inactive DH-YC-1-biotin and were diminished after YC-1 parent competition (Fig. 6e, log2(fold change (FC)) > 1, and Supplementary Table 9). Gene Ontology analysis demonstrated strong enrichment of RNA-binding proteins (28/51, odds ratio = 8.07), including mediators of RNA metabolism, splicing and translation (Fig. 6e,f, Supplementary Table 10 and Methods). There was no correlation between gene expression levels and selective YC-1 binding (Fig. 6g). Moreover, many classes of highly expressed genes showed no enrichment in binding, suggesting that YC-1 binding profiles were not indicative of protein abundance ([...] Fig. 6d and Methods). Interrogation of the InterPro protein domain database revealed specific enrichment of the RNA recognition motif, DEAD/H box and K homology RNA-binding domains (Fig. 6f, top right). Among the most differentially bound proteins (log2(FC) = 2.84) was TAR DNA-binding protein (TARDBP or TDP-43), an RNA-binding factor implicated in various aspects of RNA processing. Notably, genes encoding TARDBP and other top-ranked YC-1 target proteins (the RNA-binding factors CNOT1 and DDX42) scored as essential genes in cancer cell lines based on CRISPR screens (Extended Data Fig. 6e, retrieved from https://depmap.org). Immunoblotting of proteins from YC-1-biotin affinity pulldown assays confirmed that TARDBP, CNOT1 and DDX42 and other candidate proteins bound avidly to YC-1-biotin and were competed in a dose-dependent manner by parent YC-1 (Extended Data Fig. 7a). We also further established that YC-1 directly binds TARDBP based on a reverse coimmunoprecipitation experiment (Extended Data Fig. 7b). Cells treated with YC-1-biotin (with or without competition by parental YC-1) or DH-YC-1-biotin were lysed, and TARDBP protein was immunoprecipitated with a validated antibody 37. We confirmed that streptavidin detected YC-1-biotin in TARDBP immunoprecipitates but not inactive DH-YC-1-biotin and that parent YC-1 competition reduced the YC-1-biotin signal.
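The enrichment of RNA-binding proteins among the specifically bound set (28/51) is reported above as an odds ratio. A minimal sketch of how an odds ratio is formed from a 2 × 2 table is given here. The background counts are hypothetical, because the reference set used for the Gene Ontology analysis is not stated in this section, so the example is not expected to reproduce the reported value of 8.07.

```python
def odds_ratio(hits_in_set, set_size, hits_in_background, background_size):
    """Odds ratio for an annotation in a protein set versus a background.

    2 x 2 table:
                      annotated   not annotated
        set               a            b
        background        c            d
    """
    a = hits_in_set
    b = set_size - hits_in_set
    c = hits_in_background
    d = background_size - hits_in_background
    return (a * d) / (b * c)

# 28 of 51 specifically bound proteins are RNA-binding (from the text).
# The background (1,000 of 10,000 proteins annotated) is a hypothetical choice.
example_or = odds_ratio(28, 51, 1000, 10000)
```

An odds ratio above 1 indicates over-representation of the annotation in the bound set relative to the chosen background; significance testing (for example, Fisher's exact test) would be a separate step.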
Consistent with defects in RNA-processing factors, YC-1-treated cells exhibited alterations in RNA splicing, including marked changes in intron retention, as revealed by RNA-sequencing analysis (Extended Data Fig. 7c). Moreover, functional assays with a TARDBP splicing reporter 38 showed that YC-1 treatment impaired TARDBP-dependent RNA splicing in a SULT1A1-dependent manner (Extended Data Fig. 7d), whereas TARDBP protein levels were not consistently affected by YC-1 treatment. Thus, our data indicate that YC-1 preferentially targets specific classes of RNA-binding proteins, including splicing factors essential for cell viability (Extended Data Fig. 7e).

SULT1A1-dependent activity of alkylator compounds in vivo
We sought proof-of-concept evidence to support the potential of exploiting SULT1A1-dependent alkylators therapeutically. To this end, we tested in vivo drug response in xenografts generated with pairs of isogenic cell lines with or without SULT1A1 expression. SULT1A1+ and SULT1A1− (control) derivatives of the CCLP1 ICC cell line (Fig. 7a) were injected subcutaneously into immunodeficient mice, which were subsequently treated with YC-1 (50 mg per kg (body weight)) or vehicle after tumors reached ~100 mm³. Whereas the SULT1A1− tumors were insensitive to YC-1 treatment, the SULT1A1+ tumors regressed rapidly, with complete response within 8 d (Fig. 7b,c). To test for durability of benefit, treatment was halted after 14 d, and the mice were monitored for recurrence. There was a dramatic extension in survival despite this brief treatment course; the median survival of mice in the YC-1-treated SULT1A1+ group was 58 d versus <30 d for each of the other groups (44 d after treatment cessation versus <16 d; Fig. 7d). No significant loss of body weight was noted in the treated animals (Extended Data Fig. 8a). YC-1 treatment also reduced the growth of subcutaneous and orthotopic xenografts generated from the SULT1A1-high human-derived ICC21 cell line (Fig. 7e,f; intratumor YC-1 levels are shown in Extended Data Fig. 8b). Moreover, TUNEL staining demonstrated that YC-1 provoked death of tumor cells but not adjacent normal liver (Extended Data Fig. 8c). There was no apparent liver damage assessed by body and liver weight, histology and plasma marker levels (Extended Data Fig. 8d,e). To extend these findings to other members of this class of alkylator compounds, we examined the efficacy of RITA (Fig. 5f,g,i) in an additional xenograft model that endogenously expressed SULT1A1 or had CRISPR-mediated SULT1A1 knockout (CORL105; Extended Data Fig. 9a). As in the case of YC-1, RITA was active against xenograft growth strictly in a SULT1A1-dependent manner (Extended Data Fig. 9b-d).
Fig. 7 (legend, panels e and f) | e, ICC21 liver orthotopic xenografts were used to assess YC-1 efficacy. Mice were treated with YC-1 or vehicle for 14 d as above starting at a tested time point with observable liver mass. The liver-to-body weight ratio at each end point was used as a surrogate for tumor mass. The dashed line indicates the liver-to-body weight ratio of a healthy mouse liver. Statistical significance is annotated comparing treatment conditions. Data from two independent animals per group are shown. f, ICC21 subcutaneous xenografts were treated with YC-1 or vehicle as described above until the vehicle group reached the end point. Error bars represent mean ± s.e.m.; n = 6 independent animals per group.
Article https://doi.org/10.1038/s43018-023-00523-0
In normal tissues, SULT1A1 is most highly expressed in the liver, followed by the intestine, lung and adrenal gland, with most other tissues lacking robust expression; moreover, single-cell RNA sequencing revealed that hepatocytes are among the highest SULT1A1-expressing cell types across organs (Extended Data Fig. 10a; retrieved from https://www.proteinatlas.org/). Similarly, samples from individuals with primary HCC exhibited the highest overall expression of SULT1A1 mRNA among >80 cancer types in The Cancer Genome Atlas (TCGA; retrieved from https://www.cbioportal.org/), and hepato-cholangiocarcinoma and ICC ranked third and sixth, respectively (Extended Data Fig. 10b). To extend these data, we first validated the specificity of a SULT1A1 antibody (shown above; Extended Data Fig. 9a-c) and subsequently performed immunohistochemistry in human specimens. Within the hepatobiliary system, SULT1A1 is largely restricted to hepatocytes, with minimal expression in bile duct cells (Extended Data Fig. 10c). Accordingly, we observed distinct profiles of SULT1A1 expression after immunohistochemistry staining of tissue microarrays representing different hepatobiliary malignancies (HCC, n = 63; ICC, n = 118; ECC, n = 19; Fig. 8a [...] was specific to the neoplastic cells rather than stromal populations (Fig. 8a,b). Thus, HCC and ICC frequently express high levels of SULT1A1, consistent with their liver lineage origins, highlighting the potential of harnessing SULT1A1-activatable compounds therapeutically.

Discussion
Here, we used drug sensitivity screens, acquired resistance models and quantitative proteomics to identify the mechanism of action and define biomarkers of responsiveness for the small molecule YC-1. We show that YC-1 is potently active in vitro and in vivo against cancer cells expressing the liver lineage SULT1A1 enzyme. The YC-1 prodrug is converted by sulfonation into an electrophile that is selectively reactive with lysine residues in proteins, with enrichment for RNA-binding proteins. Using large-scale drug screening data and basal gene expression profiles of cell lines, we identified a series of other small molecules with common structural features that together represent a class of SULT1A1-dependent anticancer agents. SULT1A1 is highly expressed in a considerable subset of ICCs and HCCs. Among ICC cell lines, SULT1A1 expression correlates with a gene expression signature suggestive of an intermediate differentiation state between the hepatocyte and bile duct lineages, with associated specific genomic alterations (involving the IDH1/IDH2, BAP1 and FGFR2 genes). Correspondingly, human ICC samples with these alterations have been reported to exhibit cholangiolar histology and coexpress hepatic progenitor, hepatocyte and biliary markers 20-26. These observations are consistent with the expression of SULT1A1 in normal hepatocytes and the concept that liver cancer types represent a continuum between hepatocyte and biliary phenotypes, in line with the plasticity of these liver lineages 10,11. HCCs and ICCs carry poor prognosis, often lack actionable mutations and, when such mutations are present, show only moderate responses to targeted therapies. SULT1A1-activated anticancer drugs may offer a new avenue for treatment based on the expression of this biomarker.
We show that YC-1 binds selectively to cellular proteins, particularly via covalent linkage to lysine residues. Oncology applications of covalent inhibitors binding to cysteine and lysine have emerged in recent years. Refinement of the YC-1 scaffold may allow the development of SULT1A1-dependent covalent inhibitors with additional selectivity for protein targets. In this regard, we provide evidence that YC-1 has enriched binding to RNA-processing factors and causes aberrant RNA splicing. YC-1 derivatives could serve to expand the landscape of targetable RNA-binding proteins, taking advantage of covalency. TARDBP and DDX42 are among the most enriched YC-1-targeted RNA-binding proteins. Both are essential for cancer cell viability in vitro, are overexpressed in subsets of HCCs compared to normal liver tissue and show a positive correlation between their expression levels and poor prognosis in individuals with HCC 39-41 (retrieved from https://www.proteinatlas.org). Derivatives of YC-1 could be explored as scaffolds for efforts to target these RNA-binding proteins. Nonetheless, we find that YC-1 binds many RNA-processing proteins, which complicates identification of the events that induce cell death.
Our SAR studies highlighted the role of furfuryl alcohol in YC-1 activity and suggested that modifications of other regions can potentially enhance sulfonation and improve pharmacokinetic properties (Fig. 5b and Supplementary Table 6). In addition to YC-1, we have identified a broader class of compounds that depend on SULT1A1-mediated sulfonation for their activity against cancer cells. These compounds contain similar chemical moieties that can be sulfonated directly or after simple metabolic conversion to activate their alkylating properties. Outside the region of sulfonation, these compounds differ in overall chemical structure, which confers distinct target binding properties (for example, based on the reported profiles of RITA and N-BIC compared to YC-1; Extended Data Fig. 5d) 29,34. Using these leads with fragment-based discovery approaches could expand the landscape of targetable proteins via covalent binding.
In summary, we present a set of small molecules active against SULT1A1-expressing tumor cells. Further development of these agents could lead to prodrug approaches to target specific essential proteins in subsets of liver cancers. SULT1A1 expression transcends the genetic landscape and represents a common hepatic lineage marker, covering many liver cancers. Our data on the YC-1-bound proteome suggest the possibility of using these approaches to target RNA-binding factors. The other SULT1A1-activated compounds could provide a broader toolkit of covalent anticancer agents for additional cellular processes. Furthermore, there is an array of other human sulfotransferases (13 reported SULT family enzymes) with differing target specificity and expression patterns across normal tissues and cancer types 19,40,41. Comparable strategies could be used to identify sets of small molecules that are activated by the distinct SULT family enzymes that are highly expressed in different cancer cells, leading to the development of new classes of anticancer agents.
Limitations of the study include uncertainty of the SULT1A1 expression level required to activate YC-1, which might complicate the use of SULT1A1 as a biomarker. Further investigation is also needed to pinpoint the molecular mechanism of YC-1-induced cell death from the many binding proteins identified. Moreover, because SULT1A1 is expressed in normal liver, intestine and lung, development of YC-1 derivatives with a more specific target spectrum and preferable toxicity profiles is warranted.

Ethics statement
Animal studies adhered to the Massachusetts General Hospital (MGH) Institutional Animal Care and Use Committee-approved protocol 2019N000116. Studies with human specimens were approved by the Office for Human Research Studies at Dana-Farber/Harvard Cancer Center (protocols 19-699, 14-046 and 02-240).

Screening libraries
Primary screening used the MIPE consisting of 1,912 compounds 12 , NCATS NPACT 22 consisting of 5,099 compounds and a kinase inhibitor library (977 compounds; Supplementary Table 1).

Quantitative high-throughput screen
CCLP1, HUCCT1, RBE and SNU1079 cells were seeded into 1,536-well white-bottom plates using a Multidrop Combi peristaltic dispenser (Thermo Fisher) at 500 cells per well in 5 μl of medium. Screening was performed as described previously 42, with cells treated with compound for 72 h and quantified by CellTiter-Glo (Promega) and ViewLux microplate imaging (PerkinElmer). See Supplementary Table 11 for the assay protocol.
Compound activity was determined by plotting concentration-response data for each sample and modeling by a four-parameter logistic fit, yielding IC50 and efficacy (maximal response) values as previously described 42. Plate reads for each titration point were first normalized relative to positive control (2 mM bortezomib, 0% activity, full inhibition) and dimethylsulfoxide (DMSO)-only wells (basal, 100% activity). In-house informatics tools were used for data normalization and curve fitting. As in prior studies with the quantitative high-throughput screen, hits ranged widely in potency, and there was variation in the quality of the corresponding concentration-response curves (CRCs; based on efficacy and number of asymptotes). Samples associated with shallow curves or single-point extrapolated concentration responses were assigned as low-confidence actives. Classes -1.1 and -1.2 were highest-confidence complete CRCs (top and bottom asymptotes with efficacies of ≥80% and <80%, respectively). Classes -2.1 and -2.2 were incomplete CRCs (single asymptote with efficacies of ≥80% and <80%, respectively). Class 3 CRCs were active only at the highest concentration or were poorly fit. Class 4 CRCs were inactive (insufficient efficacy or no curve fit). AUC and curve fittings were used for activity comparison and identification of selective agents. High-confidence hits were defined based on curve class -1.1, -1.2, -2.1 or -2.2, maximum response of >50% and an IC50 of <10 μM. Screening information is summarized in Supplementary Table 13.
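The normalization and hit-calling rules above can be illustrated with a minimal Python sketch. It is not the in-house informatics pipeline used in the study; the function names and the particular parameterization of the four-parameter logistic model are assumptions made for illustration only.

```python
# Illustrative sketch only; names and the logistic parameterization are assumptions.
HIGH_CONFIDENCE_CLASSES = {-1.1, -1.2, -2.1, -2.2}


def percent_activity(raw, dmso_mean, positive_mean):
    """Rescale a raw read so DMSO-only wells map to 100% activity and
    positive-control (full inhibition) wells map to 0% activity."""
    return 100.0 * (raw - positive_mean) / (dmso_mean - positive_mean)


def four_parameter_logistic(conc, bottom, top, ic50, hill):
    """Percent activity predicted at concentration `conc` (same units as ic50).
    This is one common form of the model, assumed here."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def is_high_confidence_hit(curve_class, max_response_pct, ic50_um):
    """Apply the three stated criteria: curve class, maximum response and IC50."""
    return (curve_class in HIGH_CONFIDENCE_CLASSES
            and max_response_pct > 50.0
            and ic50_um < 10.0)
```

In this form, the predicted activity at a concentration equal to the IC50 is midway between the two asymptotes, which is what makes the fitted IC50 comparable across compounds with different efficacies.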

YC-1 sensitivity profiling across >1,000 cancer cell lines
Authenticated cancer cell lines (1,022) from the Genomics of Drug Sensitivity in Cancer platform 17 were screened with a nine-point twofold dilution series of YC-1 at the Center for Molecular Therapeutics at the MGH. Area under the dose-response curve and median inhibitory concentration were determined as previously described 17 . Cell lines sensitive to YC-1 were defined based on their ranked AUC with Z score < -1.3 and P < 0.10. The fraction of cell lines from each cancer type sensitive to YC-1 was calculated by dividing the number of those sensitive by the total number from that cancer type.
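A minimal sketch of the sensitivity call and the per-cancer-type fraction is given below. It assumes that Z scores are computed from the AUC values across all screened lines and that P is a one-sided lower-tail value from a standard normal distribution; neither detail is stated above, and the function and argument names are hypothetical.

```python
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sensitive_fraction_by_type(auc_by_line, type_by_line, z_cut=-1.3, p_cut=0.10):
    """Return {cancer_type: fraction of lines called sensitive}.

    auc_by_line:  {cell_line: AUC}, where a lower AUC means greater sensitivity
    type_by_line: {cell_line: cancer_type}
    """
    values = list(auc_by_line.values())
    mu, sigma = mean(values), pstdev(values)
    std_normal = NormalDist()
    totals = Counter(type_by_line[line] for line in auc_by_line)
    sensitive = Counter()
    for line, auc in auc_by_line.items():
        z = (auc - mu) / sigma
        p = std_normal.cdf(z)  # one-sided lower-tail P value (assumption)
        if z < z_cut and p < p_cut:
            sensitive[type_by_line[line]] += 1
    return {cancer_type: sensitive[cancer_type] / n for cancer_type, n in totals.items()}
```

Under this assumption the two thresholds are nearly equivalent, because the lower-tail probability of a standard normal at z = -1.3 is approximately 0.097.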

Chemistry and synthesis of YC-1 analogs
A detailed description of the chemical reagents and procedures used for the synthesis of YC-1 analogs and the testing for YC-1-conjugated deoxynucleobases and amino acids can be found in the Chemistry Methods Supplement.

Molecular modeling
The three-dimensional structure of SULT1A1 was obtained from the Protein Data Bank (PDB: 3U3M). The structure is complexed with the non-sulfated form PAP and 3-cyano-7-hydroxycoumarin. Before molecular modeling and docking, the protein structure was prepared using the Molecular Operating Environment (MOE; Chemical Computing Group). Hydrogens were added with standard protonation state. The modeled structure was energy minimized using the QuickPrep module in the MOE program. The active site was defined by the cocrystal ligand 3-cyano-7-hydroxycoumarin with a 4.5-Å pocket extension. YC-1 conformations were generated during MOE docking. Initial docking pose placement used Triangle Matcher and the London dG scoring function. Final pose refinement used Rigid Receptor and the GBVI/WSA dG scoring function.

Caspase 3/caspase 7 activity
Cells were seeded at 10,000 cells per well in 96-well plates. The next day, 1 μM YC-1 was added. After incubation with YC-1 for 24 h, caspase 3/caspase 7 activity was assessed using a Caspase-Glo 3/7 assay (Promega, G8090) according to the manufacturer's protocol. Data are represented as mean ± s.d. between technical triplicates.

Crystal violet staining
Cells were seeded at 100,000 cells per well in six-well plates. The next day, 1 μM YC-1 was added. At specified time points, medium was aspirated, and cells were washed with PBS, fixed with ice-cold methanol for 20 min and stained with 0.5% crystal violet in 25% methanol for 20 min at room temperature. Cells were then rinsed in tap water.

Flow cytometry analysis
For cell cycle analysis, double thymidine block-synchronized cells were released into S phase ±YC-1 and labeled with 10 μM EdU for 30 min.
Cells were treated with the Click-iT EdU Alexa Fluor 647 flow cytometry assay kit according to the recommended protocols (Thermo Fisher). Data acquisition was performed on a FACS LSRII apparatus equipped with the FACSDiva software (BD Biosciences). Our gating strategy is summarized in Supplementary Fig. 1.

In vitro resistance model
RBE and SNU1079 cells were plated in six replicates. Nine-step concentrations of YC-1 from IC10 to IC90 were calculated for the parental cells. These concentrations were used to serially treat cells, and concentrations were raised by one step once cell growth was observed for two passages. RBE cells were adapted after 2-3 months, with six independent YC-1-resistant clones exhibiting insensitivity to YC-1 concentrations two orders of magnitude greater than the IC50 of parental RBE cells. SNU1079 cells were refractory to this assay, with no clones growing beyond a three-step increase in YC-1 concentration.
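One way to derive such a concentration ladder is sketched below. It assumes a Hill-type dose-response model and steps at IC10, IC20, ... IC90; the spacing of the nine steps and the model used to compute them are not specified above, so both are assumptions, and the function names are hypothetical.

```python
def inhibitory_concentration(ic50, hill, percent):
    """Concentration giving `percent` inhibition under a Hill model,
    obtained by inverting f = C**h / (IC50**h + C**h)."""
    fraction = percent / 100.0
    return ic50 * (fraction / (1.0 - fraction)) ** (1.0 / hill)


def escalation_ladder(ic50, hill=1.0):
    """Nine steps from IC10 to IC90 in 10% increments (assumed spacing)."""
    return [inhibitory_concentration(ic50, hill, p) for p in range(10, 100, 10)]
```

With a Hill slope of 1, the ladder spans an 81-fold concentration range (from IC50/9 to 9 × IC50); steeper slopes compress this range.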

Plasmids and transduction
To generate SULT1A1-knockout cells, sgRNAs were cloned into pLentiCRISPRv2 (Addgene, 52961; see Supplementary Table 12 for the sequences). These plasmids were used to generate virus by transfection of HEK293T cells with pCMV-VSV-G (Addgene) and dCMV-dR8.91 packaging plasmids. Collected virus was filtered through 0.45-μm filters and used to spin-infect target cells with 8 μg ml⁻¹ polybrene (Millipore, TR-1003-G) at 2,250 r.p.m. and 37 °C for 60 min. After 24 h, cells were selected in 2 μg ml⁻¹ puromycin for at least 3 d, and pooled populations were first tested for SULT1A1 knockout via immunoblotting. Human SULT1A1 (variant allele V223M) was cloned from SNU1079 mRNA (forward primer 5′-ATCGAGATCTGCCACCATGGAGCTGATCCAGGACAC-3′ and reverse primer 5′-ATCGCTCGAGTCACAGCTCAGAGCGGAAGC-3′). cDNA was inserted into pMSCV-Blast. Because several SULT1A1 polymorphic variants are common in populations and may confer distinct substrate affinity, we also created constructs with the variants V220M, V223M and F247L and S44N, V164A and V223M. Site-directed mutagenesis (New England BioLabs) was used to create SULT1A1 expression vectors resistant to CRISPRv2 gRNA to reintroduce SULT1A1 into knockout cells while not affecting the amino acid sequence (Supplementary Table 12). Murine stem cell virus-derived plasmids were used to generate viruses in combination with pCL-ECO and pCMV-VSV-G packaging plasmids. Successfully transduced target cells were selected with 10 μg ml⁻¹ blasticidin for 1 week.

Immunoblotting
Cell lysis, electrophoresis and immunoblotting were performed as described previously 43.

Dose-response assays
Responses to drug were assessed by plating cells in 96-well plates. Growth was quantified using an MTT colorimetric assay read at 490 nm. Each assay was performed at least twice, with the exception of the studies in Fig. 2b, in which multiple independent YC-1-resistant lines and replicate parental lines were analyzed in a single assay. IC50 curves were generated from two biological replicates (except for Fig. 2b, using technical replicates) and analyzed with GraphPad Prism 8.

SULT1A1 activity assay
A colorimetric assay for SULT1A1 activity was adapted from Rothman et al. 29. MES buffer (pH 7.5), p-nitrophenyl sulfate (5 mM), test substrate (YC-1 or YC-1 derivatives) and PAPS (0.02 mM) were added to a 96-well plate. The reaction was initiated via the addition of recombinant SULT1A1 (20 ng μl⁻¹ or 580 nM) and monitored over time at an absorbance of 405 nm.
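As a worked illustration, the sketch below checks the stated enzyme concentration and estimates a reaction rate from a time course of absorbance readings. The molecular weight of about 34.5 kDa is inferred here from the two stated concentrations (20 ng μl⁻¹ and 580 nM) rather than given in the text, and using a least-squares slope as the rate estimate is an assumption, not the analysis reported in the study.

```python
def molar_concentration_nM(mass_ng_per_ul, molecular_weight_da):
    """Convert a mass concentration in ng/µl to nM (1 ng/µl equals 1e-3 g/l)."""
    grams_per_litre = mass_ng_per_ul * 1e-3
    return grams_per_litre / molecular_weight_da * 1e9


def initial_rate(times_min, absorbance_405):
    """Least-squares slope of A405 versus time, in absorbance units per minute."""
    n = len(times_min)
    t_mean = sum(times_min) / n
    a_mean = sum(absorbance_405) / n
    sxy = sum((t - t_mean) * (a - a_mean) for t, a in zip(times_min, absorbance_405))
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    return sxy / sxx
```

For example, `molar_concentration_nM(20, 34483)` returns approximately 580, consistent with the two concentrations quoted above; comparing `initial_rate` values across test substrates would only be meaningful over the linear portion of each time course.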

Computational identification of SULT1A1-activatable compounds
Identification of compounds with similar response profiles to YC-1 and/or with a correlation between SULT1A1 expression and sensitivity was performed using the NCI-60 database, the PRISM lab at the Broad Institute 32 [...] Table 3), and the output was compounds from each database with Pearson correlation score to YC-1 profiles. For analysis on the DTP NCI-60 database, Cellminer (https://discover.nci.nih.gov/cellminer/) was queried with an input of YC-1 (NSC 728165) for similar sensitivity patterns to YC-1 and with SULT1A1 mRNA levels for identification of potential SULT1A1-activatable compounds. The top ~150 correlating compounds from these queries were manually curated to identify a putative chemical moiety for SULT1A1 sulfonation and to group by structural features.
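The ranking step can be sketched as follows. This is a generic illustration of ranking compounds by Pearson correlation against a reference profile, not the query logic of Cellminer or of the other databases; the function names and the input layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_by_correlation(reference, profiles, top_n=150):
    """Rank compounds by correlation with a reference profile.

    reference: sensitivity values for YC-1 (or SULT1A1 mRNA levels) per cell line
    profiles:  {compound: sensitivity values in the same cell-line order}
    """
    scored = [(name, pearson(reference, values)) for name, values in profiles.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]
```

When the reference is SULT1A1 expression and sensitivity is expressed as AUC or IC50 (lower values meaning greater sensitivity), the compounds of interest would have the most negative correlations, so the sort order would need to be reversed.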

Immunofluorescence of intracellular YC-1-biotin
The predicted covalent binding of YC-1 suggested opportunities to track its cellular uptake and localization via immunofluorescence. Briefly, cells were seeded into six-well plates on collagenized glass coverslips. YC-1-biotin-treated cells were washed three times with PBS, fixed at room temperature in 4% paraformaldehyde in PBS for 15 min with light agitation, washed three times with PBS, permeabilized and blocked for 30 min with 1% whole goat serum in 0.1% Tween in PBS (PBS-T). Next, DAPI (Molecular Probes) and streptavidin-Alexa Fluor 488 conjugate (Thermo Fisher, S11223) were added for 30 min in PBS-T with light agitation at room temperature. Cells were washed three times with PBS and mounted with ProLong Gold Antifade reagent (Molecular Probes). A Nikon Eclipse Ti inverted fluorescence microscope with an oil immersion ×60 objective was used for imaging. A linear range of intensity and no thresholding were used for acquired images. Consistent filter settings for DAPI and 488 FITC channels were used for sequential scans.

Quantitative proteomics
Cells were lysed and prepared for tryptic digest as previously described 44. Peptides (50 μg) were labeled using TMT reagents (Thermo Fisher), combined and fractionated using basic reversed-phase high-performance LC. Fractions were analyzed by reversed-phase LC-MS2/MS3 for 3 h on an Orbitrap Fusion or Lumos. MS3 isolation for quantification used simultaneous precursor selection, as previously described 45. MS2 spectra were assigned using SEQUEST by searching against the UniProt database on an in-house-built platform. A target-decoy database-based search was used to filter protein identifications to a false-discovery rate (FDR) of <1% (ref. 46). Peptides that matched to more than one protein were assigned to the protein containing the largest number of matched redundant peptide sequences following the law of parsimony. TMT reporter ion intensities were extracted from the MS3 spectra, selecting the most intense ion within a 0.003-m/z window centered at the predicted m/z value for each reporter ion, and spectra were used for quantification if the sum of the S/N values of all reporter ions divided by the number of analyzed channels was ≥20 and the isolation specificity for the precursor ion was ≥0.75. Protein intensities were calculated by summing the TMT reporter ions for all peptides assigned to a protein. Intensities were first normalized using a bridge channel (pooled digest of all analyzed samples in an experiment) relative to the median bridge channel intensity across all proteins. In a second normalization step, protein intensities measured for each sample were normalized by the average of the median protein intensities measured across the samples.
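The quantification filter, protein roll-up and two-step normalization can be sketched in Python as below. This is one possible reading of the description above, not the in-house platform used in the study; the data layout, function names and the exact form of each scaling step are assumptions.

```python
from statistics import mean, median


def passes_quant_filter(reporter_sn, isolation_specificity,
                        min_mean_sn=20.0, min_specificity=0.75):
    """Keep a spectrum if summed reporter S/N per channel and isolation specificity pass."""
    return (sum(reporter_sn) / len(reporter_sn) >= min_mean_sn
            and isolation_specificity >= min_specificity)


def protein_intensities(peptides):
    """Sum reporter intensities channel-wise over (protein, [intensity per channel]) pairs."""
    totals = {}
    for protein, channels in peptides:
        current = totals.setdefault(protein, [0.0] * len(channels))
        for i, value in enumerate(channels):
            current[i] += value
    return totals


def normalize(intensities, bridge_index):
    """Two-step normalization (assumed interpretation); returns sample channels only."""
    bridge_median = median(row[bridge_index] for row in intensities.values())
    # Step 1: scale each protein so its bridge channel equals the median bridge intensity.
    step1 = {protein: [v * bridge_median / row[bridge_index] for v in row]
             for protein, row in intensities.items()}
    n_channels = len(next(iter(step1.values())))
    sample_idx = [i for i in range(n_channels) if i != bridge_index]
    # Step 2: scale each sample so its median equals the average of the sample medians.
    medians = {i: median(row[i] for row in step1.values()) for i in sample_idx}
    target = mean(medians.values())
    return {protein: [row[i] * target / medians[i] for i in sample_idx]
            for protein, row in step1.items()}
```

The first step is intended to make proteins comparable across experiments that share the pooled bridge sample, and the second to correct for unequal sample loading within an experiment.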
For affinity-enriched proteomics profiling, after washing, beads were resuspended in 50 mM HEPES (pH 8.5), reduced and alkylated. Urea solution (8 M) was added to a final concentration of 1 M. After tryptic digest, one-third of the resulting peptides of each sample were labeled using TMT-10plex reagents. Labeled samples were combined and analyzed in a 3-h reversed-phase LC-MS2/MS3 run on an Orbitrap Lumos.

Testing for YC-1-conjugated deoxynucleobases and amino acids
DNA adducts. We adapted published methods 47 to test whether YC-1 forms DNA adducts. YC-1-treated RBE cells were lysed, and deoxynucleobases were released from DNA by enzymatic cleavage of glycosidic bonds. Samples were analyzed by LC-MS to detect the presence of molecular species with predicted m/z values of YC-1 conjugating to each of the four deoxynucleobases. While the method demonstrates high sensitivity detecting trace amounts of colibactin DNA adduct 47 , we did not observe evidence supporting YC-1 conjugation to DNA. In addition, the extracted DNA from YC-1-biotin-treated RBE cells with SULT1A1 overexpression was subjected to dot blotting for affinity detection of YC-1-biotin DNA adducts using streptavidin-HRP. There was no signal of streptavidin-HRP retained on the DNA-spotted nylon membrane.
Protein adducts. Nucleic acid-free protein extracts were generated from YC-1-biotin- and DH-YC-1-biotin-treated cells, as described above, and were subjected to dot blotting and detection with streptavidin-HRP (Fig. 6b and Extended Data Fig. 6a). LC-MS was used to detect YC-1-conjugated amino acids as detailed in the Chemistry Methods Supplement.

Computational analysis of the YC-1-binding proteome
Fifty-one significant YC-1-binding proteins were filtered by a binding score of log2(FC) (immunoprecipitation/control) > 1. Immunoprecipitation/control denotes the ratio of abundance of each protein pulled down by YC-1-biotin relative to inactive YC-1-biotin treatment. Significant binders were analyzed for Gene Ontology by the gene set enrichment analysis (GSEA) tool EnrichR (https://maayanlab.cloud/Enrichr/). To generate Fig. 6f (left), the top enriched terms from the Gene Ontology biological process and Gene Ontology molecular function databases were plotted, with bubble size indicating the significance score of -log10(FDR) (Supplementary Table 10). The bar graph in Fig. 6f was based on an integrative analysis using the InterPro protein domain database by EnrichR. For comparative analysis between YC-1 binders and most abundantly expressed genes (Extended Data Fig. 6d), enriched terms from Gene Ontology biological process and Gene Ontology molecular function databases were derived from EnrichR using the 500 most abundantly expressed genes based on RNA-sequencing data. The odds ratios of the enriched terms among YC-1-binding proteins were compared with the enriched terms derived using the most abundantly expressed genes. The graph in Fig. 6g was generated by selecting and graphing the most contrasting terms between YC-1 binders and the most abundantly expressed genes.
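The binding-score filter can be written as a short Python sketch. The input dictionaries and the function name are hypothetical; only the log2(FC) > 1 threshold comes from the text above.

```python
import math


def significant_binders(pulldown, control, min_log2_fc=1.0):
    """Return {protein: log2(FC)} for proteins whose pulldown/control ratio exceeds
    the cut-off, ordered from the highest to the lowest binding score.

    pulldown: {protein: abundance after YC-1-biotin pulldown}
    control:  {protein: abundance after inactive-probe pulldown}
    """
    scores = {}
    for protein, abundance in pulldown.items():
        log2_fc = math.log2(abundance / control[protein])
        if log2_fc > min_log2_fc:
            scores[protein] = log2_fc
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

A threshold of log2(FC) > 1 corresponds to more than twofold enrichment over the inactive probe; a log2(FC) of 2.84 corresponds to roughly sevenfold enrichment.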

RNA sequencing
RNA extracted from RBE and SNU1079 cells (treated with YC-1 or vehicle) using a Qiagen RNeasy Plus Mini kit was processed using the TruSeq Stranded mRNA library preparation kit (Illumina). Samples were run on a Nextseq 500 (Illumina). Reads were aligned to the human reference genome hg38 using STAR (v2.5.3a). Transcript levels were quantified using SALMON (v0.9.1). Count data extraction and normalization and comparison were performed using tximport and DESeq2, respectively (Bioconductor). To analyze RNA splicing, BAM output files from RNA-sequencing alignments were sorted and indexed using SAMtools. Insert length was calculated with pe_utils --compute-insert-len. Expression levels (psi) for retained introns and skipped exons were obtained using MISO 48. Alternative event annotations of hg38 were generated by rnaseqlib. For filtering events, only events with at least 10 supporting reads for inclusion or exclusion isoforms and at least 20 supporting reads for all isoforms were used. The mean psi value of all filtered retained introns and skipped exons was used as the event score.
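The event filter and event score can be sketched as follows. The sketch treats the read thresholds as minimums and the total as the sum of inclusion and exclusion reads, which is an interpretation of the description above; it operates on already-extracted values rather than on MISO output files, and the function names are hypothetical.

```python
from statistics import mean


def keep_event(inclusion_reads, exclusion_reads, min_either=10, min_total=20):
    """Read-support filter: enough reads for at least one isoform and enough overall."""
    return ((inclusion_reads >= min_either or exclusion_reads >= min_either)
            and inclusion_reads + exclusion_reads >= min_total)


def event_score(events):
    """Mean psi over events passing the filter.

    events: iterable of (psi, inclusion_reads, exclusion_reads) tuples.
    Returns None when no event passes.
    """
    kept = [psi for psi, inc, exc in events if keep_event(inc, exc)]
    return mean(kept) if kept else None
```

Comparing this score between YC-1-treated and vehicle-treated samples gives a single summary of the shift in intron retention or exon skipping, at the cost of hiding event-level heterogeneity.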

TARDBP splicing assay
We adapted a published TARDBP splicing assay 38 . A plasmid (Addgene, 107859) containing mEGFP fused to mCherry, interrupted by CFTR exon 9 (bound by and skippable with functional TARDBP) was introduced transiently into SULT1A1-overexpressing or control 293T cells. Following YC-1 (200 nM) or DMSO vehicle treatment, single-cell fluorescence images were captured with GFP (488-nm) and red fluorescent protein (RFP; 561-nm) lasers using a confocal Nikon A1R microscope and were analyzed with ImageJ. TARDBP splicing activity was calculated using the normalized ratio of RFP to GFP from over 500 cells using three to five images in triplicate experiments.
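A minimal sketch of the ratio calculation is shown below. It assumes that per-cell RFP and GFP intensities have already been measured (for example, exported from ImageJ) and that normalization is relative to the mean ratio in vehicle-treated cells; the normalization reference is not specified above, and the function names are hypothetical.

```python
from statistics import mean


def rfp_gfp_ratios(cells):
    """Per-cell RFP/GFP ratio from (rfp, gfp) intensity pairs, skipping cells without GFP signal."""
    return [rfp / gfp for rfp, gfp in cells if gfp > 0]


def normalized_splicing_activity(treated_cells, control_cells):
    """Mean treated ratio divided by the mean control ratio (control = 1.0)."""
    return mean(rfp_gfp_ratios(treated_cells)) / mean(rfp_gfp_ratios(control_cells))
```

Using a ratio of the two fluorophores rather than a single channel makes the readout largely independent of how much reporter plasmid each cell received.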

Xenograft studies
Immunodeficient mice (NOD-scid Il2rg null (NSG) strain), age 6-10 weeks, were housed in pathogen-free animal facilities. Studies were under protocol 2019N000116 approved by the MGH Institutional Animal Care and Use Committee, whose regulations for maximum tumor size (<2 cm in greatest diameter) were strictly adhered to. CCLP1 cells (2 × 10⁶ cells) exogenously expressing SULT1A1 or empty vector control were injected subcutaneously into recipient mice (both sexes). When tumor volume reached ~100 mm³, mice were randomly assigned to the YC-1 or vehicle group (five to six mice per group; efforts were made to balance sex). The mice were treated with intraperitoneal injection of YC-1 dissolved in DMSO (50 mg per kg (body weight) daily for 14 d).
Tumor volumes were measured twice per week. When tumors became ulcerated or exceeded 1,000 mm³, mice were killed, and tumor samples were collected. For histology, tissue samples were fixed overnight in 10% formalin, embedded in paraffin, sectioned and stained with hematoxylin and eosin.
Orthotopic models were performed using 500,000 ICC21 cells injected into the liver 49 . Sex was not considered for selection of mice but was considered for balancing when grouping. Pilot studies were conducted to define the engraftment and growth kinetics of the orthotopic tumor model, showing that tumors developed by 6 weeks and reached the end point by 8 weeks. Thus, treatment studies were initiated at 6 weeks after injection. Researchers were not blinded during the conduct of the experiments. Both sexes of mice were used and showed similar tumor growth.

Histology and immunostaining
Sample fixation, embedding, sectioning and staining were performed by iHisto as described previously 49. For antigen unmasking, specimens were heated in a 2100 Antigen Retriever (Aptum Biologics) in antigen unmasking solution (H-3300, Vector Laboratories), rinsed three times with PBS-T, incubated for 10 min with 1% H2O2 at room temperature and washed three times with PBS-T. After blocking (5% goat serum in PBS-T) for 1 h, tissue sections were incubated at 4 °C overnight with anti-SULT1A1 (Thermo Fisher, CF501838, clone OTI1G10) diluted 1:200 in blocking solution. Samples were washed three times for 3 min each in PBS-T and incubated with MACH 2 rabbit HRP-polymer (Biocare Medical, RHRP520) for 1 h at room temperature. Slides were stained for peroxidase for 3 min with the DAB substrate kit (Biocare Medical, DS900), washed with water and counterstained with hematoxylin. TUNEL staining (R&D Systems, 4810-30-K) was performed according to the manufacturer's protocol. Slides were photographed with an Olympus DP74 microscope. SULT1A1 staining intensity was evaluated semiquantitatively in tumor slides by a gastrointestinal cancer pathologist (V.D.) who was blinded to the origin of the tissue. Tissue microarrays (3-mm cores) were constructed from resected human samples (N = 200 individuals). Information on sex and age was not available. These studies were approved by the Institutional Review Board in the Office for Human Research Studies at Dana-Farber/Harvard Cancer Center under protocols 19-699, 14-046 and 02-240.

Statistics and reproducibility
Data distribution was assumed to be normal, but this was not formally tested. No statistical methods were used to predetermine sample sizes, but our sample sizes are similar to those reported in previous publications 43,50. The gastrointestinal cancer pathologist was blinded to sample allocation during semiquantitative assessment of pathology and immunohistochemistry. Other data collection and analyses were not performed blind to the conditions of the experiments. No data were excluded from the analyses, and randomization was limited to the in vivo experiments. Experimental results were reproducible across multiple (two or more) independent biological replicates, shown with two to three replicates.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The RNA-sequencing dataset assessing YC-1 treatment has been deposited to the Gene Expression Omnibus, available with accession number GSE168791. MS raw data can be accessed through the Mas-sIVE data repository (massive.ucsd.edu) under accession numbers MSV000090805 and MSV000090808. The human pan-cancer data were derived from the TCGA Research Network at http://cancergenome.nih.gov/. The dataset derived from this resource that supports the findings of this study is available at https://ucsc-xena.gitbook.io/ project/cite-us. Article https://doi.org/10.1038/s43018-023-00523-0 Source data are provided with this paper. All other data supporting the findings of this study are available from the corresponding authors on reasonable request. Article https://doi.org/10.1038/s43018-023-00523-0 Extended Data Fig. 7 | YC-1 covalently binds RNA processing factors and influences RNA splicing. a, Immunoblot from streptavidin affinity purification validating YC-1 binding proteins. RBE cells were treated with YC-1 biotin or DH-YC-1 biotin control in the presence of excess non-biotinylated YC-1 or DH-YC-1 as indicated. Left: Expression of candidate YC-1 binding proteins in whole cell lysates. Right: Immunoblot after Streptavidin capture, showing dose-dependent competition by parent YC-1. b, Immunoblot from TARDBP immunoprecipitation validating direct YC-1 binding. RBE cells were treated as in (a). The immunoblots (a, b) were performed two times with similar results. c, Scatter plot of genes with altered intron retention identified from RNA-seq analysis of RBE and SNU1079 cells treated with YC-1 or vehicle for 6 and 16 hours. n = 3 biological replicates per condition. ΔRI is the intron retention score calculated by the SALMON software package. d, Left, a TARDBP splicing efficiency assay assessing SULT1A1 dependent YC-1 impact on TARDBP RNA splicing activity. 
293T cells exogenously expressing SULT1A1 or empty vector were transiently transfected with the plasmid containing the reporter module, treated with YC-1 or DMSO vehicle, and analyzed by fluorescence confocal microscopy with GFP (G) and mCherry (R) lasers. Statistical significance is annotated between individual conditions (Welch's unpaired t-test). n = 3 biologically independent experiments, with cells from two independent images per experiment included (>500 cells in total). 'n.s.' denotes not significant. Right, YC-1 sensitivity assay confirming stable SULT1A1 expression. Two biologically independent experiments are shown. e, siRNA targeting TARDBP (left) or DDX42 (right) reduced target protein expression and cell number, monitored for 5-6 d post transfection. Error bars in left panel are mean ± s.d. Data shown are from one of two experiments performed with similar results.

Extended Data Fig. 8 | SULT1A1 determines YC-1 efficacy in vivo. a, SULT1A1-positive and SULT1A1-negative (control) CCLP1 cells were implanted subcutaneously into NSG mice. Once tumors reached ~100 mm³, mice were treated with YC-1 (50 mg/kg) or vehicle for 14 d. Mice were then monitored for disease progression in the absence of treatment. Left: Graph of individual serial tumor volumes. These data are presented in the form of mean volumes in Fig. 6b of the main figures. Right: Serial changes in body weight. Error bars are mean ± s.e.m. n = 5-6 independent animals per group. b-e, Study of SULT1A1-high expressing ICC21 xenografts in response to YC-1 treatment. b, YC-1 concentration was measured by mass spectrometry in three independent ICC21 xenograft tumor samples after YC-1 or vehicle treatment. Dashed line marks the in vitro ICC21 sensitivity to YC-1 treatment (IC50). Error bars are mean ± s.d. n = 3 independent samples per group. c, Tissue sections of ICC21 orthotopic tumors (middle panels) and adjacent normal tissue (left panels) subjected to H&E and TUNEL staining.
TUNEL staining is quantified in the graph at the right; two independent animals per group are shown. Scale bar, 100 μm. d, Serial changes in body weight (left) were monitored for three weeks in subcutaneous tumor-bearing mice on YC-1 treatment, and liver-to-body weight ratios (right) were recorded at the time of euthanasia. Error bars are mean ± s.e.m. n = 5 independent animals per group, two-tailed, unpaired Student's t-test. e, Table displaying plasma markers indicative of liver function in plasma samples from vehicle- and YC-1-treated mice (P values derived by two-tailed, unpaired Student's t-test).

Extended Data
Extended Data Fig. 9 | SULT1A1 determines RITA efficacy in vivo. a-d, Study of SULT1A1-dependent sensitivity of the CORL105 xenograft model to RITA. CORL105 is an IDH1-R132C mutant lung adenocarcinoma cell line with high endogenous SULT1A1 levels that grows robustly in vivo. a, Generation of CORL105 derivatives with CRISPR-mediated SULT1A1 KO. Upper, Immunoblot showing loss of SULT1A1 protein expression upon CRISPR-mediated deletion of SULT1A1 (CSK1-3) and robust SULT1A1 detection in parental CORL105 cells and control sgGFP cells. The immunoblot was performed two times with similar results. Lower, Demonstration that CORL105 cells are highly sensitive to RITA in a SULT1A1-dependent manner. Two biologically independent experiments are shown. b, Representative immunohistochemistry staining from CORL105 control (sgGFP) and SULT1A1 KO (CSK2) xenografts, showing loss of staining with the SULT1A1 antibody in the SULT1A1 KO model. This serves as validation of SULT1A1 antibody specificity for IHC studies. Similar results were obtained in samples from 2-4 independent animals per group and three groups with independent sgRNA designs targeting the SULT1A1 gene. Scale bar, 200 μm. c, Immunoblot confirming SULT1A1 protein loss in xenograft tumors generated from SULT1A1 KO CORL105 cells. The immunoblot was performed a single time, with multiple independent tumors analyzed per condition. d, Mice harboring tumors of 100-150 mm³ were treated with RITA (100 mg/kg) or vehicle daily with intermittent dosing breaks. Graphs show serial monitoring of group tumor volume (left), individual tumor volume (middle) and body weight (right). Dosing breaks are denoted by grey shading. Error bars are mean ± s.e.m. n = 5-10 independent animals per group.