Main

Liver cancer is one of the greatest challenges in oncology, with an annual worldwide burden of >800,000 new diagnoses and >700,000 deaths and an incidence rate that has been rising for several decades1,2. The main types of primary adult liver malignancy are intrahepatic cholangiocarcinomas (ICCs) and hepatocellular carcinomas (HCCs), classified by morphological and molecular similarity to bile duct cells and hepatocytes, respectively. The standard treatments in the advanced setting are combination chemotherapy for ICC3 and combined immunotherapy/multikinase inhibition for HCC4. While response rates and overall survival have improved, outcomes remain poor, and no molecular stratification is used to guide first-line treatment decisions.

The identification of genomic alterations across different subsets of individuals with liver cancer has led to the recent exploration of precision medicine strategies. In ICC, targeted therapies against isocitrate dehydrogenase 1 (IDH1) mutations, fibroblast growth factor receptor 2 (FGFR2) fusions and BRAF mutations show benefit3. However, response rates remain relatively low, and disease progression inevitably occurs. Moreover, more than half of individuals with ICC lack presently actionable mutations. Likewise, a subset of individuals with HCC have genomic alterations suggesting response to targeted therapies, although it is not clear whether these approaches represent improvements over standard of care4. Thus, complementary exploration of combination or alternative treatment modalities is warranted.

While ICC and HCC have different genetic and clinicopathological features, there may be opportunities to harness overlaps in biology relating to liver cell lineage states. In particular, the presence of mixed histological subtypes and the expression of common lineage markers suggest that liver tumors may comprise a continuous spectrum between hepatocyte-like and bile duct-like phenotypes5,6,7,8,9, observations consistent with the capacity of hepatocytes and bile duct cells to transdifferentiate via bipotential intermediates10,11.

In this Article, by conducting high-throughput pharmacologic screens, functional studies and proteomics analyses, we defined synthetic lethal interactions with the small molecule 3-(5′-hydroxymethyl-2′-furyl)-1-benzylindazole (YC-1; lificiguat) in specific liver cancer subsets. We showed that YC-1 is metabolically activated by the hepatocyte lineage cytosolic phenol sulfotransferase SULT1A1, which is highly expressed in a substantial subset of HCCs and in ICCs with dual hepatocyte/bile duct features. SULT1A1 converts YC-1 to a strong alkylator with a target profile enriched for RNA-binding proteins. Subsequent pharmacogenomic analysis, secondary screens and molecular modeling revealed a broader class of SULT1A1-dependent anticancer compounds with a common chemical motif. These studies suggest opportunities to harness this class of activatable alkylators against SULT1A1+ liver cancers of different genotypes.

Results

YC-1 is selectively active against liver cancer cells

We initially sought to identify synthetic lethal therapeutic interactions in IDH1-mutant (IDHm) ICC by conducting a screen on two IDHm (RBE and SNU1079) and two isocitrate dehydrogenase wild type (IDH WT; CCLP1 and HUCCT1) ICC cell lines against the National Center for Advancing Translational Sciences (NCATS) Mechanism Interrogation Plate (MIPE)12, consisting of 1,912 oncology-focused compounds, including those with a predicted mode of action and those without established targets (Fig. 1a and Supplementary Table 1). We also screened three of these lines against the NCATS Pharmacologically Active Chemical Toolbox (NPACT)13 and kinase-targeting libraries, totaling an additional 6,076 annotated clinical and preclinical compounds. Comparing ranked differential sensitivity scores (area under the curve (AUC)) between IDHm and IDH WT groups, we identified 36 compounds (1.9% of the MIPE library) that were selectively active against IDHm ICC cells (Fig. 1b; Z < –1.65, P < 0.05), of which 14 (39%) had well-defined mechanisms of action. The most significant outliers against IDHm ICC were SRC family kinase (SFK) inhibitors and YC-1 (Fig. 1b). We previously reported characterization of the sensitivity of IDHm ICC to SFK inhibitors based on a prior screen14 and hence focused on YC-1 for further analysis. Scaled up experiments demonstrated that YC-1 selectively induced apoptosis, marked by activation of p53 and caspase 3/caspase 7, which was preceded by cell cycle arrest at the G1/S phase transition (Extended Data Fig. 1a–g).
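The ranking step described above can be illustrated with a minimal sketch. It assumes that differential sensitivity is the difference in mean AUC between IDHm and IDH WT lines, standardized across all compounds, with hits called at Z < –1.65; the exact scoring procedure used in the screen is not specified in this section, and the function names and toy AUC values below are hypothetical.

```python
from statistics import mean, pstdev


def differential_z_scores(auc_mut, auc_wt):
    """Return {compound: z}, where z standardizes mean(AUC IDHm) - mean(AUC WT).

    auc_mut / auc_wt map each compound to a list of per-cell-line AUC values.
    A lower AUC means greater sensitivity, so negative z means IDHm-selective.
    """
    diffs = {c: mean(auc_mut[c]) - mean(auc_wt[c]) for c in auc_mut}
    values = list(diffs.values())
    mu, sd = mean(values), pstdev(values)
    return {c: (d - mu) / sd for c, d in diffs.items()}


def selective_hits(z_scores, cutoff=-1.65):
    """Compounds whose z-score falls below the (assumed) one-sided cutoff."""
    return sorted(c for c, z in z_scores.items() if z < cutoff)


# Hypothetical toy data: five inert compounds and one IDHm-selective compound.
toy_mut = {f"cmpd_{x}": [100.0, 100.0] for x in "ABCDE"}
toy_wt = {f"cmpd_{x}": [100.0, 100.0] for x in "ABCDEF"}
toy_mut["cmpd_F"] = [40.0, 40.0]

toy_z = differential_z_scores(toy_mut, toy_wt)
toy_hits = selective_hits(toy_z)
```

With only two cell lines per group, a per-compound t-test has little power, which is one reason a library-wide standardized score of this kind is a plausible way to rank outliers.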

Fig. 1: Identification of selective YC-1 activity against liver cancer subsets.

a, Schematic of drug screening and validation studies; MoA, mechanism of action. b, Graphs of the results from the small-molecule screen with the MIPE library in IDHm ICC cell lines (SNU1079 and RBE) and WT IDH1 ICC cell lines (HUCCT1 and CCLP1). Top, differential sensitivity (x axis) and significance (y axis; –log (P value)) of compounds toward IDHm versus WT IDH1 lines. Relative sensitivity of the IDHm cells is denoted by size of the bubble. Bottom, ranking of individual compounds according to differential sensitivity. Significance was analyzed using a two-tailed Student’s t-test. P < 0.05 was considered statistically significant. The screen was performed once using a concentration–response profile (stepwise fivefold dilutions of drug between 92.1 µM and 0.006 µM). eGFR, estimated glomerular filtration rate. c, Heat map of YC-1 sensitivity in 25 biliary cancer cell lines and in MMNK1 cells (immortalized bile duct). IDHm cell lines are highlighted. d, IC50 measurements for YC-1 in select IDHm (red) and WT IDH1 (black/gray) ICC cell lines. Two biologically independent experiments are shown. e, Compiled results of YC-1 sensitivity in 1,022 cancer cell lines. The data show the ranked fraction of YC-1-sensitive cell lines in each cancer type. The screen was performed once using a nine-point twofold dilution series of YC-1; IH, intrahepatic; EH, extrahepatic; NSCLC, non-small cell lung cancer. f, Graph showing that ICC cell lines with IDH1/IDH2, FGFR2 and BAP1 genomic alterations rank among the most sensitive in the screen. ‘YC-1 sensitivity’ (y axis) denotes log10-transformed YC-1 IC50 values (μM). Red dots represent RBE, SNU1079 and ICC5 cells (IDH1R132C, IDH1R132S, and IDH1R132L mutants, respectively), and the pink dot represents YSCCC cells (IDH1R100Q mutant).

YC-1 lacks an established mechanism of action mediating its cytotoxicity, although it has been reported to function as an inhibitor of hypoxia-inducible factor 1-α (HIF1α)15 and, at high concentrations (>50 μM), an agonist of soluble guanylyl cyclase (sGC)16. However, we failed to observe specific activity against IDHm ICC cells by multiple selective HIF1α inhibitors or sGC agonists tested in the screen or in subsequent studies using a larger cell line panel (Extended Data Fig. 2a). Moreover, CRISPR screens indicated that HIF1A and HIF2A are dispensable for the growth of IDHm ICC cells in vitro (Extended Data Fig. 2b). Therefore, we concluded that YC-1 decreases cell viability through a distinct mechanism.

We defined the profile of YC-1 activity across an expanded set of biliary cell lines representing diverse genomic features and biology (Fig. 1a, middle, and Supplementary Table 2; n = 26 cell lines, including ICC, extrahepatic cholangiocarcinoma (ECC), gallbladder carcinoma and the immortalized bile duct line MMNK1). Calculation of the half-maximal growth inhibitory concentration for each cell line (IC50; Fig. 1c,d) revealed a >130,000-fold range of sensitivity (4.77 nM to >631 μM). Each IDHm ICC cell line tested (n = 5) ranked as highly sensitive. However, several WT IDH1 cell lines showed comparable responsiveness, prompting us to consider determinants for YC-1 sensitivity beyond IDH status.
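As a rough illustration of how an IC50 and the reported fold range can be derived, the sketch below estimates IC50 by linear interpolation on log10 concentration and then checks the >130,000-fold figure from the two stated extremes (4.77 nM and 631 μM). The interpolation method, the function names and the toy dose–response values are assumptions made for illustration; the curve-fitting procedure actually used is described in the Methods, not here.

```python
import math


def interpolated_ic50(concs, viability, threshold=0.5):
    """Estimate the concentration at which viability first drops below threshold.

    concs must be ascending and in the same unit; viability is a fraction of
    the untreated control. Interpolates linearly on log10(concentration).
    Returns None if the curve never crosses the threshold.
    """
    points = list(zip(concs, viability))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= threshold > v_hi:
            frac = (v_lo - threshold) / (v_lo - v_hi)
            log_span = math.log10(c_hi) - math.log10(c_lo)
            return 10 ** (math.log10(c_lo) + frac * log_span)
    return None


def fold_range(ic50_min, ic50_max):
    """Ratio between the least and most sensitive IC50 (same unit for both)."""
    return ic50_max / ic50_min


# Hypothetical dose-response series (concentrations in uM).
toy_ic50 = interpolated_ic50([0.01, 0.1, 1.0, 10.0], [1.0, 0.9, 0.3, 0.05])

# Extremes quoted in the text, both expressed in uM: 4.77 nM and 631 uM.
reported_fold = fold_range(4.77e-3, 631.0)
```

The second calculation gives a ratio of roughly 132,000, consistent with the ">130,000-fold" range stated above; because the upper value is itself a lower bound (>631 μM), the true range may be wider.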

To further define contexts for YC-1 sensitivity, we tested this compound against a panel of 1,022 cancer cell lines derived from >25 tumor types, which we have profiled extensively as part of our Genomics of Drug Sensitivity program17,18 (Fig. 1a, right). In total, we identified 101 YC-1-responsive cell lines across cancer types (9.9%; Supplementary Table 3 and Methods). There were broad trends in response, with particular enrichment of sensitivity in primary liver cancers (ICC and HCC) and bone and pleural tumors, whereas prostate, stomach, skin and esophageal cancer cell lines (among others) were largely resistant (Fig. 1e). Multiple ICC cell lines, including those with IDH mutations and other genotypes (FGFR2 fusion and BAP1 inactivation), ranked among the most sensitive (Fig. 1f). Cell lines derived from other anatomical subtypes of biliary cancer (ECC and gallbladder carcinoma) were not highly responsive to YC-1 (Fig. 1e). Thus, YC-1 responsiveness varies widely among human cancer cell lines, with enriched sensitivity in both major liver malignancies.

SULT1A1 expression confers YC-1 sensitivity

To study the basis for YC-1 selectivity, we developed acquired resistance models by subjecting RBE cells to gradually increasing concentrations of YC-1 (Fig. 2a and Methods). Six YC-1-resistant clones were isolated, and each was insensitive at concentrations greater than 25 μM compared to an IC50 of 0.426 μM for parental RBE cells (Fig. 2b). The resistant phenotype was stable after culturing without YC-1 and then rechallenging with drug. We used tandem mass tag (TMT) labeling-based quantitative mass spectrometry (MS) to identify proteome changes associated with acquired resistance to YC-1 (Fig. 2c, top). Compared to parental RBE cells, all six resistant lines showed striking changes in levels of a single protein among 9,895 proteins detected, specifically, depletion of the cytosolic sulfotransferase enzyme SULT1A1 (Fig. 2d and Supplementary Table 4). The related sulfotransferase, SULT1A4, exhibited a similar, although less pronounced, trend in depletion. Immunoblotting confirmed marked depletion of SULT1A1 in resistant clones (Fig. 2e).

Fig. 2: YC-1 sensitivity correlates with SULT1A1 expression levels.

a, Schematic of acquired YC-1 resistance experiment. b, Sensitivity (IC50) to YC-1 of parental RBE cells and acquired resistance models. IC50 curves (left) and computed values (right) are shown. Asterisks indicate IC50 values too high to calculate. Graphs show means of technical replicates. c, Schematic of TMT proteomics analysis of parental and YC-1-resistant RBE cells (top; corresponds with d) and of a large panel of ICC cell lines (bottom; corresponds with f). d, Volcano plot of proteomics data comparing parental and resistant RBE cell lines, highlighting significant depletion of SULT1A1 in resistant lines (two-tailed, unpaired Student’s t-test). e, Immunoblot validating SULT1A1 loss in resistant cells. Samples are from the same gel and exposure. The cropping removes an irrelevant lane. f, TMT proteomics comparing 5 YC-1-sensitive (IC50 median = 0.256 μM) and 32 YC-1-insensitive (IC50 median = 18.9 μM, not including cell lines with no response) biliary cell lines (two-tailed, unpaired Student’s t-test). g, Immunoblot for SULT1A1 in the indicated cell lines. Each is biliary, with the exception of HepG2 (HCC) and CORL105 (SULT1A1high lung cancer). h, Graph of the correlation between SULT1A1 protein levels and YC-1 IC50 across a set of 19 biliary tract cell lines. The data points in gray showed no response to YC-1. The linear regression line is shown. Immunoblots (e and g) were from one of the two performed experiments with similar results.

To further explore the association between SULT1A1 levels and YC-1 response, we analyzed a panel of biliary cancer cell lines (n = 37) with multiplexed quantitative proteomics and calculated differential protein expression and significance between YC-1-sensitive and YC-1-insensitive groups (IC50 = 0.04–2.14 μM and IC50 > 3.00 μM, respectively; Fig. 2c,f and Supplementary Table 5). Reduced expression of SULT1A1 in the YC-1-insensitive group was again the top outlier across this heterogeneous set of cell lines. Immunoblotting showed nearly binary differences in expression of SULT1A1 in YC-1-sensitive versus YC-1-insensitive groups (Fig. 2g). Furthermore, a strong linear correlation was observed between YC-1 sensitivity (IC50) and SULT1A1 protein expression after log10 normalization (Fig. 2h). Examination of SULT1A1 mRNA expression indicated that the differences in protein expression were due to transcriptional regulation (Extended Data Fig. 3a).
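The log10-based correlation described above can be sketched as follows. This is an illustrative calculation only: it assumes a Pearson correlation on log10-transformed protein levels and IC50 values, the function names are hypothetical, and the toy numbers are invented rather than taken from the cell line panel.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def log10_correlation(protein_levels, ic50_values):
    """Correlate log10(protein level) with log10(IC50); both must be > 0."""
    log_protein = [math.log10(p) for p in protein_levels]
    log_ic50 = [math.log10(v) for v in ic50_values]
    return pearson_r(log_protein, log_ic50)


# Invented toy values: each tenfold drop in protein raises IC50 tenfold.
toy_r = log10_correlation([1000.0, 100.0, 10.0, 1.0], [0.01, 0.1, 1.0, 10.0])
```

Working in log space matters here because both quantities span several orders of magnitude; on a linear scale a few extreme cell lines would dominate the fit. Cell lines with no measurable response have no finite IC50 and would need to be excluded or handled separately before applying a calculation like this.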

Human SULT1A1 is a cytosolic phenol sulfotransferase that participates in xenobiotic metabolism and hormonal regulation19. We used CRISPR–Cas9-mediated knockout to test the functional role of SULT1A1 in the response to YC-1 (Fig. 3a). SULT1A1 knockout in SNU1079 cells with six distinct short guide RNAs (sgRNAs) caused marked resistance to YC-1 (>100-fold increase in IC50) relative to parental cells and cells transduced to express sgRNA against green fluorescent protein (GFP; Fig. 3b,c), whereas the response to the SRC/SFK inhibitor dasatinib14 (Fig. 1b) was unaffected (Fig. 3c). Comparable results were observed in two other YC-1-hypersensitive cell lines, RBE and ICC20 (Extended Data Fig. 3b–e). Expression of CRISPR-resistant SULT1A1 (Extended Data Fig. 3f) restored responsiveness of SULT1A1-knockout cells to YC-1 treatment, confirming specificity (Fig. 3d,e). Conversely, exogenous viral expression of common polymorphic variants of SULT1A1 was sufficient to engender YC-1 sensitivity in all six SULT1A1low (YC-1-insensitive) cholangiocarcinoma cell lines tested, reducing the IC50 by 1,000- to 10,000-fold, whereas proliferation of the mouse hepatocyte cell line AML12 was unaffected (Fig. 3f–h and Extended Data Fig. 3g–i). By contrast, overexpression of a distinct sulfotransferase, SULT1A3, resulted in only an approximately tenfold increased sensitivity to YC-1 (Fig. 3g,h and Extended Data Fig. 3h,i). Thus, we conclude that SULT1A1 expression determines sensitivity to YC-1.

Fig. 3: SULT1A1 determines YC-1 sensitivity.

a, Schematic for genetic knockout of SULT1A1 in ICC cells. b, Immunoblot for SULT1A1 in SNU1079 parental cells or CRISPR-engineered derivatives with control sgGFP or sgSULT1A1 knockout (CSK1–CSK6). c, SNU1079 parental cells or the engineered derivatives were tested for sensitivity to YC-1 (left) or dasatinib (right). SSP25 and CCLP1 are ICC cell lines that are insensitive to both drugs and are shown for reference. Two biologically independent experiments are shown. d, Immunoblot demonstrating restored expression of SULT1A1 using a CRISPR-resistant construct in RBE SULT1A1-knockout cells; EV, empty vector. e, Reexpression of CRISPR-resistant SULT1A1 resensitizes SULT1A1-knockout RBE cells to YC-1. Data show mean measurements from two biologically independent experiments. f, Schematic for ectopic overexpression of SULT1A1 in ICC cells. g, Immunoblot confirming overexpression of SULT1A1 (denoted by an asterisk (*)), corresponding to h. Several common germline variants of SULT1A1 were tested: SULT1A1-1 (V220M, V223M and F247L), SULT1A1-2 (S44N, V164A and V223M) and SULT1A1-3 (V223M). h, Ectopic expression of SULT1A1 sensitizes SSP25 cells to YC-1 but not dasatinib. Two biologically independent experiments are shown. Error bars represent mean ± s.d.; n = 4 biologically independent experiments. SULT1A3 only modestly increases sensitivity. Immunoblots (b, d and g) were from one of the two performed experiments with similar results.

Human SULT1A1 is selectively expressed in hepatocytes. In this regard, reexamination of the YC-1 response profiles indicated that YC-1 sensitivity and SULT1A1 levels in ICC cells correlate with a distinct protein expression signature. This signature consists of enrichment of hepatocyte markers with concurrent expression of bile duct markers, whereas SULT1A1– biliary cancer cell lines lack substantial expression of hepatocyte markers (Fig. 4a,b and Extended Data Fig. 4a). This ‘bilineage’ signature is associated with specific genomic features (IDH mutation, FGFR2 fusion and BAP1 loss; Fig. 4c). Notably, in human samples, these genomic alterations correlate with the small duct histological subtype of ICC, resembling the cholangioles (canals of Hering), channels at the junction of the hepatocytes and biliary tree and lined serially by cells of either lineage20,21,22,23,24,25,26. ICCs lacking these mutations show similarity to the large, mature bile ducts (that is, large duct subtype). YC-1 responsiveness is depicted in relation to these genotypes and to hepatobiliary cancer subtype (HCC, ICC, ECC, gallbladder carcinoma or mixed ICC/HCC) in Fig. 4c–e. Analysis of 23 human-derived xenograft models also showed associations of SULT1A1 protein expression with IDH mutations, FGFR2 fusions and BAP1 mutation (Extended Data Fig. 4b). Thus, SULT1A1 expression defines YC-1-sensitive cells and is enriched in ICC cells exhibiting a bilineage expression signature (Fig. 4f) and in HCC.

Fig. 4: SULT1A1 expression is associated with hepatocyte lineage.

a,b, GSEA (a) and heat map (b) of hepatocyte protein expression in ICC cell lines according to SULT1A1 protein levels. Significance was calculated as FDR by the GSEA package; q < 0.25 was considered statistically significant; NES, normalized enrichment score. c, Circos plot of 28 biliary tract cell lines depicting YC-1 sensitivity, biliary cancer type and specific molecular features (that is, IDH1 mutation, FGFR2 fusion and absence of BAP1 protein expression). The asterisk (*) indicates mixed ICC/HCC histology; GB, gallbladder carcinoma; ND, normal duct. d, YC-1 sensitivity measurement in representative HCC, ECC and gallbladder carcinoma cell lines. Two biologically independent experiments are shown. e, Scatter plot comparing YC-1 IC50 values of cell lines between liver cancer subtypes. Asterisks (*) indicate cell lines exhibiting no response to YC-1 (IC50 not calculable). f, Model relating SULT1A1 expression, liver lineage marker expression and genomic alterations in ICC.

Furfuryl alcohol moiety determines YC-1 toxicity

SULT1A1 uses the cosubstrate 3′-phosphoadenosine-5′-phosphosulfate (PAPS) to transfer a high-energy sulfate to the hydroxy moiety of phenol groups within target molecules (metabolites, xenobiotics and hormones). Sulfonation increases aqueous solubility of xenobiotics and alters the binding properties of hormones (Extended Data Fig. 5a). YC-1 is composed of a furfuryl alcohol, indazole core and benzyl group (Fig. 5a). The furfuryl alcohol of YC-1 structurally mimics phenol, suggesting that this group may be a substrate for SULT1A1 phenol sulfotransferase activity and that YC-1 sulfonation may underlie its cytotoxicity. In this regard, crystal structures of SULT1A1 with known substrates reveal plasticity within the catalytic site, permitting a range in substrate specificity27.

Fig. 5: A furfuryl alcohol moiety determines YC-1 toxicity and defines a class of SULT1A1-activatable compounds.

a,b, One hundred and twenty analogs of YC-1 were generated and screened for activity against two SULT1A1high cell lines (RBE and SNU1079) and two SULT1A1low cell lines (CCLP1 and SSP25). a, Schematic of the chemical moieties of YC-1 (left) and summary of SAR data for the YC-1 analogs grouped according to modifications in the indicated chemical groups. The y axis represents shifts in AUC of the specific YC-1 analogs versus parental YC-1 in SULT1A1high cell lines. The x axis compares the activity of the analogs versus parental YC-1 in terms of differential sensitivity toward SULT1A1high cell lines relative to SULT1A1low lines. b, Graph showing the ranked activity of YC-1 analogs (or parent compound) in terms of differential sensitivity toward SULT1A1high cells versus SULT1A1low cells (y axis). The color code represents the relative sensitivity of SULT1A1high cells to each analog. Bubble sizes denote significance (P value). c, Structural modeling analysis showing docking of YC-1 in the SULT1A1 crystal structure (PDB: 3U3M). A schematic of the predicted sulfonation of YC-1 by SULT1A1 is shown on the bottom. d, Treatment of RBE cells with YC-1 in the presence or absence of a potent (DCNP-A) or less potent (DCNP-B) SULT1A1 inhibitor. Two biologically independent experiments are shown. e, In vitro enzymatic assay showing that SULT1A1 modifies YC-1 but not its dehydroxylated analog. YC-1 or dehydroxylated YC-1 was incubated with recombinant SULT1A1 protein in the presence of p-nitrophenylsulfate and 3′-phosphoadenosine-5′-phosphosulfate (for an additional control, YC-1 was incubated in the reaction buffer without SULT1A1). The reaction was monitored by quantifying released p-nitrophenol via measuring UV absorbance at 405 nm. Data shown are mean measurements from one of the two performed experiments with similar results. f, Results of the computational analysis of the NCI-60 database using CellMiner showing compound groups whose activity profiles are highly correlated with that of YC-1 (y axis) and with SULT1A1 mRNA levels (x axis). Bubble size represents the number of compounds within a given group. g, Volcano plot of the computational analysis of the PRISM database showing correlation of sensitivity profiles of compounds with YC-1 profiles. Pearson correlations (ImEffect size on the x axis) were computed between the sensitivity profile of YC-1 (Fig. 1f and Supplementary Table 3) and the DepMap PRISM Drug Sensitivity data. For visualization purposes, only drugs with Pearson correlation values of >0.07 are shown. h, Scatter plot showing the correlation between YC-1 and oncrasin-1 sensitivity profiles across 398 cancer cell lines. Relative SULT1A1 mRNA levels are depicted by the color scheme. i, Chemical structures of representative SULT1A1-activatable compounds. Note the common furfuryl/benzyl alcohol moieties. Significance (b and f) was analyzed by using two-tailed Student’s t-tests. A P value of <0.05 was considered statistically significant.

We surveyed structure–activity relationships (SARs) by systematically modifying each structural component of YC-1 (Fig. 5a). A set of 118 analogs was synthesized and screened against two YC-1-sensitive and two YC-1-insensitive ICC cell lines with high and low SULT1A1 expression, respectively. This analysis indicated that the furan group and hydroxymethyl on the furfuryl alcohol were most important for YC-1 selectivity (differential AUC) and efficacy (average AUC relative to the SULT1A1high group; Fig. 5a, right). Notably, loss of the hydroxy group within the furfuryl alcohol abolished YC-1 activity (273-fold increase in IC50; Fig. 5b and Extended Data Fig. 5b), consistent with the importance of sulfonation of this group. By contrast, several analogs containing modifications to the benzyl group exhibited increased selectivity toward SULT1A1high cells (Fig. 5b and Supplementary Table 6). We computationally modeled the interaction of YC-1 with the crystal structure of human SULT1A1 (ref. 27; Fig. 5c and Methods). The cosubstrate PAPS (represented by the non-sulfated form PAP in the crystal structure) is coordinated at one side of the catalytic pocket. YC-1 fits on the opposite side in a branched conformation with its hydroxy pointing toward the high-energy sulfate from PAPS. Molecular interactions specifically coordinating SULT1A1 and YC-1 include a cation–π interaction from His 108 to the furan, π–π stacking from Phe 84 to the benzyl and a hydrogen bond between Lys 106 and the oxygen of furan (Extended Data Fig. 5c). Thus, structural modeling supports YC-1 as a SULT1A1 substrate. Accordingly, we tested whether SULT1A1 enzymatic activity was required for YC-1 efficacy by using the phenol-mimicking SULT1A1 inhibitor 2,6-dichloro-4-nitrophenol (DCNP-A)19 and its analog 2,4-dichloro-6-nitrophenol (DCNP-B). YC-1-treated cells were completely rescued by increasing concentrations of DCNP-A, whereas DCNP-B produced a milder rescue (Fig. 5d). Importantly, an in vitro reconstituted enzymatic assay showed that recombinant SULT1A1 sulfonates YC-1 but not its dehydroxylated form (Fig. 5e). Collectively, these data demonstrate the requirement of SULT1A1 sulfotransferase activity for YC-1 efficacy and indicate that the furfuryl alcohol moiety is the direct target of sulfonation (Fig. 5c, bottom).

The highly specific mechanism of YC-1 activation prompted us to identify additional compounds potentially activated by SULT1A1 via computational analysis of pharmacogenomic databases (Methods). First, we queried the NCI Developmental Therapeutics Program database (NCI-60), which has annotated cytotoxicity of >22,000 compounds against 60 cancer cell lines. Using the CellMiner NCI-60 tool (https://discover.nci.nih.gov/cellminer/), we identified hundreds of compounds whose activity profiles showed high correlation with either SULT1A1 transcript levels or YC-1 sensitivity (designated NSC 728165 in the NCI-60 database). The top ~150 compounds were categorized into groups based on chemical structure (Fig. 5f, Extended Data Fig. 5d and Supplementary Table 7), including analogs of oncrasin-1 (N-benzyl indole carbinol (N-BIC) group), reactivating p53 and inducing tumor apoptosis (RITA) and aminoflavone (anticancer agents whose activity has been predicted or experimentally shown to depend on SULT1A1 (refs. 28,29,30,31)) as well as sets of molecules not previously linked to SULT1A1, namely Phortress analogs, and two additional groups of compounds. Query of the Broad Institute PRISM data platform32 representing >4,000 small molecules tested against a panel of 578 cancer cell lines also revealed strong correlations between oncrasin-1, RITA and Phortress sensitivity profiles and both our YC-1 response data and SULT1A1 mRNA expression levels (Fig. 5g,h and Extended Data Fig. 5e).

The molecules identified by the CellMiner NCI-60 analysis included 80 related compounds (amino halogenated benzyl alcohol (AHBA) series), of which 66 were highly similar to one another, sharing a core structure of 2-halogenated 4-amino benzyl alcohol, reminiscent of the furfuryl alcohol of YC-1 (Fig. 5f,i and Extended Data Fig. 5d). Testing the AHBA series in our cell line panel (two SULT1A1+ and SULT1A1– lines) confirmed selective activity toward SULT1A1+ cells, comparable to that of YC-1 (Extended Data Fig. 5f and Supplementary Table 8). The other group includes compounds containing hydrazone derivatives of benzyl alcohols (hydrazone group; Fig. 5f and Extended Data Fig. 5d). Hydrazones (composed of an aldehyde or ketone capped by hydrazine) are susceptible to acid hydrolysis to expose the aldehyde33, which is likely the target of sulfonation. Thus, we demonstrate unexpected, critical roles for SULT1A1 in the activity of previously studied anticancer agents (YC-1 and Phortress), and we identify an additional compound series whose activity correlates with SULT1A1 expression.

N-BIC and RITA have been proposed to be converted to electrophilic alkylators by in situ sulfonation of their hydroxymethyl groups28,30. In addition, aminoflavone is thought to be hydroxylated by cytochrome P450 enzymes, enabling its subsequent sulfonation to become an electrophilic alkylator31. Examination of each group of SULT1A1-activated agents suggested a common chemical structure of electron-rich benzyl alcohol derivatives. Following sulfonation, the ring structure is presumably converted into a stabilized, electrophilic intermediate that, in turn, acts as an alkylating reagent. Thus, our elucidation of the YC-1 mechanism of action, together with identification of these additional compound groups (N-BIC, RITA, AHBA and hydrazone), defines a new antitumor compound class activatable by SULT1A1 that harbors a core furfuryl or benzyl alcohol structure that is present natively (Fig. 5i) or after metabolic processing31.

Sulfonated YC-1 alkylates proteins

SULT1A1 activity can generate alkylators, suggesting that the aforementioned compounds may bind covalently to cellular targets. To explore the mechanism of YC-1 cytotoxicity, we developed YC-1 derivatives based on the SAR data. In particular, we generated affinity tags and click chemistry reagents by conjugating biotin with a PEG linker (YC-1–biotin) or an alkyne/azide, respectively, to the benzyl group (Fig. 6a), which we found to be amenable to modification (Fig. 5a). These compounds maintained SULT1A1-dependent efficacy, with meta-substituted YC-1–biotin showing highest selectivity against SULT1A1-expressing cells. As an inactive control, we also generated a dehydroxylated analog (DH-YC-1–biotin), which is incapable of being sulfonated and lacks efficacy (Fig. 6a).

Fig. 6: Proteomic identification of YC-1 binding targets.

a, Activity of YC-1–biotin and DH-YC-1–biotin against parental RBE cells and derivative lines with SULT1A1 knockout (CSK2) and SULT1A1 knockout with SULT1A1 reexpression (CSK2 R4). Two biologically independent experiments are shown. b, Dot blot (left) and western blot (right) of protein lysates from RBE cells treated with YC-1–biotin or DH-YC-1–biotin for the indicated times. Blots were probed with HRP-conjugated streptavidin. Ponceau S staining for dot blots and β-actin for western blots served as the total protein loading controls. c, Immunofluorescence images of RBE cells treated with YC-1–biotin. Fixed cells were stained with streptavidin–FITC to detect YC-1–biotin and with DAPI for visualization of the nucleus; scale bar, 17 µm. d, Dot blot of protein lysates from RBE cells treated as indicated for specificity and SULT1A1 dependency. e, Scatter plot of the results of the YC-1 pulldown. Enrichment is revealed by binding to YC-1–biotin relative to DH-YC-1–biotin control (x axis) and YC-1–biotin binding competed by parent YC-1 (y axis). Proteins with specific RNA-binding domains are color coded; IP, immunoprecipitation; RRM, RNA recognition motif; KH, K homology. f, Bubble chart of YC-1-binding proteins displaying enrichments based on the Gene Ontology (GO) molecular function and biological process databases (Methods); dsRNA, double-stranded RNA; LSM, like Sm; ZnF, zinc finger; FYVE, Fab 1-YOTB-Vac 1-EEA1; PABP_HYD, polyadenylate-binding protein/hyperplastic disc protein. The bar graph (right) depicts enrichment among different classes of RNA-binding domains. Significance was calculated as adjusted P value using a two-sided Fisher’s exact test and the Benjamini–Hochberg method for correction for multiple hypothesis testing. Adjusted P values of <0.05 were considered statistically significant. Immunoblotting and immunofluorescence experiments in b and c were performed two times with similar results. g, Graph showing correlation between specific YC-1 binding score for proteins detected in YC-1 pulldowns and mRNA expression of the associated gene; TPM, transcripts per million. In e and g, the color code indicates proteins with common RNA-binding domains identified by EnrichR analysis.

N-BIC was previously found to covalently bind to proteins in the cytosol29, RITA was found to cross-link DNA and proteins34, and aminoflavone and Phortress were found to form DNA adducts35,36. Accordingly, we sought to determine whether YC-1 covalently binds intracellular molecules in a SULT1A1-dependent manner. First, we explored potential YC-1–protein adduct formation by dot blot analysis of nucleic acid-free protein extracts from cells treated with YC-1–biotin or DH-YC-1–biotin. Probing blots with streptavidin revealed enriched binding to YC-1–biotin, which was greatly augmented after SULT1A1 overexpression (Extended Data Fig. 6a). Subsequent analysis of cells treated with YC-1 derivatives revealed a temporal increase in covalent binding of YC-1–biotin (Fig. 6b). Immunofluorescence using a streptavidin–FITC probe also showed progressive accumulation of YC-1–biotin in the cytosol and subsequent nuclear intensification, reinforcing the covalent nature of YC-1 binding to protein targets (Fig. 6c and Methods). Furthermore, YC-1–biotin binding was largely abolished by YC-1 parent competition or DCNP inhibition of SULT1A1 catalytic activity, indicative of protein binding specificity and its dependence on SULT1A1 (Fig. 6d). By contrast, we failed to observe evidence of YC-1–DNA adduct formation in studies in which we either extracted DNA from YC-1–biotin-treated cells and performed DNA dot blots (probed with streptavidin) or extracted DNA from YC-1 parent-treated cells and tested for hydrolyzed nucleic acids via liquid chromatography–mass spectrometry (Methods).

We next sought to identify the amino acid residue(s) in proteins that are conjugated by YC-1. The YC-1–biotin-bound proteome was isolated by streptavidin bead affinity purification after 1 day (d) of treatment and was then subjected to complete proteolytic digestion. MS revealed strong detection of YC-1–biotin conjugation to lysine residues, followed by serine and asparagine, compared to control DH-YC-1–biotin samples (Extended Data Fig. 6b and Methods). The side chain of each differentially conjugated amino acid residue contains a nucleophilic nitrogen (for example, amine in lysine) or oxygen (for example, hydroxy in serine) that can react and form a covalent bond with the electrophilic intermediate of YC-1 (Extended Data Fig. 6c). Thus, we conclude that sulfonated YC-1 binds cellular proteins, most prominently via covalent linkage with the side chain of lysine residues.

We used a chemoproteomic approach to identify proteins covalently bound by YC-1. Cells were treated with YC-1–biotin or DH-YC-1–biotin for 8 h. Lysates were then subjected to streptavidin-based affinity purification in the presence of YC-1 parent compound or inactive YC-1 followed by TMT proteomics. Of 250 proteins detected by YC-1–biotin affinity pulldown, 51 were specifically bound compared to inactive DH-YC-1–biotin and were diminished after YC-1 parent competition (Fig. 6e, log2 (fold change (FC)) > 1, and Supplementary Table 9). Gene Ontology analysis demonstrated strong enrichment of RNA-binding proteins (28/51, odds ratio = 8.07), including mediators of RNA metabolism, splicing and translation (Fig. 6e,f, Supplementary Table 10 and Methods). There was no correlation between gene expression levels and selective YC-1 binding (Fig. 6g). Moreover, many classes of highly expressed genes showed no enrichment in binding, suggesting that YC-1 binding profiles were not indicative of protein abundance (Extended Data Fig. 6d and Methods). Interrogation of the InterPro protein domain database revealed specific enrichment of the RNA recognition motif, DEAD/H box and K homology RNA-binding domains (Fig. 6f, top right). Among the most differentially bound proteins (log2 (FC) = 2.84) was TAR DNA-binding protein (TARDBP or TDP-43), an RNA-binding factor implicated in various aspects of RNA processing. Notably, genes encoding TARDBP and other top-ranked YC-1 target proteins (the RNA-binding factors CNOT1 and DDX42) scored as essential genes in cancer cell lines based on CRISPR screens (Extended Data Fig. 6e, retrieved from https://depmap.org). Immunoblotting of proteins from YC-1–biotin affinity pulldown assays confirmed that TARDBP, CNOT1 and DDX42 and other candidate proteins bound avidly to YC-1–biotin and were competed in a dose-dependent manner by parent YC-1 (Extended Data Fig. 7a). 
We also further established that YC-1 directly binds TARDBP based on a reverse coimmunoprecipitation experiment (Extended Data Fig. 7b). Cells treated with YC-1–biotin (with or without competition by parental YC-1) or DH-YC-1–biotin were lysed, and TARDBP protein was immunoprecipitated with a validated antibody37. We confirmed that streptavidin detected YC-1–biotin in TARDBP immunoprecipitates but not inactive DH-YC-1–biotin and that parent YC-1 competition reduced the YC-1–biotin signal.

Consistent with defects in RNA-processing factors, YC-1-treated cells exhibited alterations in RNA splicing, including marked changes in intron retention, as revealed by RNA-sequencing analysis (Extended Data Fig. 7c). Moreover, functional assays with a TARDBP splicing reporter38 showed that YC-1 treatment impaired TARDBP-dependent RNA splicing in a SULT1A1-dependent manner (Extended Data Fig. 7d), whereas TARDBP protein levels were not consistently affected by YC-1 treatment. Thus, our data indicate that YC-1 preferentially targets specific classes of RNA-binding proteins, including splicing factors essential for cell viability (Extended Data Fig. 7e).

SULT1A1-dependent activity of alkylator compounds in vivo

We sought proof-of-concept evidence to support the potential of exploiting SULT1A1-dependent alkylators therapeutically. To this end, we tested in vivo drug response in xenografts generated with pairs of isogenic cell lines with or without SULT1A1 expression. SULT1A1+ and SULT1A1– (control) derivatives of the CCLP1 ICC cell line (Fig. 7a) were injected subcutaneously into immunodeficient mice, which were subsequently treated with YC-1 (50 mg per kg (body weight)) or vehicle after tumors reached ~100 mm³. Whereas the SULT1A1– tumors were insensitive to YC-1 treatment, the SULT1A1+ tumors regressed rapidly, with complete response within 8 d (Fig. 7b,c). To test for durability of benefit, treatment was halted after 14 d, and the mice were monitored for recurrence. There was a dramatic extension in survival despite this brief treatment course; the median survival of mice in the YC-1-treated SULT1A1+ group was 58 d versus <30 d for each of the other groups (44 d after treatment cessation versus <16 d; Fig. 7d). No significant loss of body weight was noted in the treated animals (Extended Data Fig. 8a). YC-1 treatment also reduced the growth of subcutaneous and orthotopic xenografts generated from the SULT1A1-high human-derived ICC21 cell line (Fig. 7e,f; intratumor YC-1 levels are shown in Extended Data Fig. 8b). Moreover, TUNEL staining demonstrated that YC-1 provoked death of tumor cells but not adjacent normal liver (Extended Data Fig. 8c). There was no apparent liver damage assessed by body and liver weight, histology and plasma marker levels (Extended Data Fig. 8d,e). To extend these findings to other members of this class of alkylator compounds, we examined the efficacy of RITA (Fig. 5f,g,i) in an additional xenograft model that endogenously expressed SULT1A1 or had CRISPR-mediated SULT1A1 knockout (CORL105; Extended Data Fig. 9a). As in the case of YC-1, RITA was active against xenograft growth strictly in a SULT1A1-dependent manner (Extended Data Fig. 9b–d).

Fig. 7: SULT1A1 determines YC-1 efficacy in vivo.

a–d, SULT1A1+ and SULT1A1– (control) CCLP1 cells were implanted subcutaneously into NSG mice. a, Once tumors reached ~100 mm³, mice were treated with YC-1 (50 mg per kg (body weight)) or vehicle for 14 d. Mice were then monitored for disease progression in the absence of treatment; i.p., intraperitoneal. b, Graph of serial tumor volumes. Error bars represent mean ± s.d.; Rx, treatment. c, Waterfall plot of best tumor response under treatment; n = 6 mice per group, except the control YC-1 group (n = 5 mice). d, Survival analysis of mice during treatment and after cessation of treatment; n = 6 tumors per group, except the control YC-1 group (n = 5 tumors). e, ICC21 liver orthotopic xenografts were used to assess YC-1 efficacy. Mice were treated with YC-1 or vehicle for 14 d as above starting at a tested time point with observable liver mass. Liver and body weight ratio at each end point was used as a surrogate for tumor mass. The dashed line indicates the liver-to-body weight ratio of a healthy mouse liver. Statistical significance is annotated comparing treatment conditions. Data from two independent animals per group are shown. f, ICC21 subcutaneous xenografts were treated with YC-1 or vehicle as described above until the vehicle group reached the end point. Error bars represent mean ± s.e.m.; n = 6 independent animals per group.


In normal tissues, SULT1A1 is most highly expressed in the liver, followed by the intestine, lung and adrenal gland, with most other tissues lacking robust expression; moreover, single-cell RNA sequencing revealed that hepatocytes are among the highest SULT1A1-expressing cell types across organs (Extended Data Fig. 10a; retrieved from https://www.proteinatlas.org/). Similarly, samples from individuals with primary HCC exhibited the highest overall expression of SULT1A1 mRNA among >80 cancer types in The Cancer Genome Atlas (TCGA; retrieved from https://www.cbioportal.org/), and hepato-cholangiocarcinoma and ICC ranked third and sixth, respectively (Extended Data Fig. 10b). To extend these data, we first validated the specificity of a SULT1A1 antibody (shown above; Extended Data Fig. 9a–c) and subsequently performed immunohistochemistry in human specimens. Within the hepatobiliary system, SULT1A1 is largely restricted to hepatocytes, with minimal expression in bile duct cells (Extended Data Fig. 10c). Accordingly, we observed distinct profiles of SULT1A1 expression after immunohistochemistry staining of tissue microarrays representing different hepatobiliary malignancies (HCC, n = 63; ICC, n = 118; ECC, n = 19; Fig. 8a). Ranking staining intensity as high, intermediate, low and no (negative) expression (Methods), we found that the majority of HCCs (92%) and ICCs (67%) had high or intermediate SULT1A1 levels compared to 22% of ECCs (P ≤ 0.0001, HCC versus ECC and ICC versus ECC). Considering only high levels of SULT1A1 expression, ICCs had the highest rate (43%), followed by HCC (25%) and ECC (11%). SULT1A1 staining was specific to the neoplastic cells rather than stromal populations (Fig. 8a,b). Thus, HCC and ICC frequently express high levels of SULT1A1 consistent with their liver lineage origins, highlighting the potential of harnessing SULT1A1-activatable compounds therapeutically.

Fig. 8: SULT1A1 is frequently expressed in tumor samples from individuals with liver cancer.

a, Representative immunohistochemistry staining for SULT1A1 expression in human HCC and cholangiocarcinoma (ICC and ECC) samples showing examples of negative, low, medium and high expression. Semiquantitative measurements of staining intensity are shown in the pie charts on the right; n, number of samples from independent individuals examined; scale bar, 100 μm; IHC, immunohistochemistry. b, Representative immunohistochemistry staining for SULT1A1 expression in human ICC samples showing SULT1A1 expression in tumor cells (right) and adjacent normal liver hepatocytes (left). Six tissue cores from six cases were analyzed. Yellow dashed lines demarcate the tumor and adjacent normal liver areas, marked by an arrowhead and arrow, respectively; scale bar, 100 μm.


Discussion

Here, we used drug sensitivity screens, acquired resistance models and quantitative proteomics to identify the mechanism of action and define biomarkers of responsiveness for the small molecule YC-1. We show that YC-1 is potently active in vitro and in vivo against cancer cells expressing the liver lineage SULT1A1 enzyme. The YC-1 prodrug is converted by sulfonation into an electrophile that is selectively reactive with lysine residues in proteins, with enrichment for RNA-binding proteins. Using large-scale drug screening data and basal gene expression profiles of cell lines, we identified a series of other small molecules with common structural features that together represent a class of SULT1A1-dependent anticancer agents.

SULT1A1 is highly expressed in a considerable subset of ICCs and HCCs. Among ICC cell lines, SULT1A1 expression correlates with a gene expression signature suggestive of an intermediate differentiation state between the hepatocyte and bile duct lineages, with associated specific genomic alterations (involving the IDH1/IDH2, BAP1 and FGFR2 genes). Correspondingly, human ICC samples with these alterations have been reported to exhibit cholangiolar histology and coexpress hepatic progenitor, hepatocyte and biliary markers20,21,22,23,24,25,26. These observations are consistent with the expression of SULT1A1 in normal hepatocytes and the concept that liver cancer types represent a continuum between hepatocyte and biliary phenotypes, in line with the plasticity of these liver lineages10,11. HCCs and ICCs carry poor prognosis, often lack actionable mutations and, when such mutations are present, show only moderate responses to targeted therapies. SULT1A1-activated anticancer drugs may offer a new avenue for treatment opportunities based on the expression of this biomarker.

We show that YC-1 binds selectively to cellular proteins, particularly via covalent linkage to lysine residues. Oncology applications of covalent inhibitors binding to cysteine and lysine have emerged in recent years. Refinement of the YC-1 scaffold may allow the development of SULT1A1-dependent covalent inhibitors with additional selectivity for protein targets. In this regard, we provide evidence that YC-1 has enriched binding to RNA-processing factors and causes aberrant RNA splicing. YC-1 derivatives could serve to expand the landscape of targetable RNA-binding proteins, taking advantage of covalency. TARDBP and DDX42 are among the most enriched YC-1-targeted RNA-binding proteins. Both are essential for cancer cell viability in vitro, are overexpressed in subsets of HCCs compared to normal liver tissue and show a positive correlation between their expression levels and poor prognosis in individuals with HCC39,40,41 (retrieved from https://www.proteinatlas.org). Derivatives of YC-1 could be explored as scaffolds for efforts to target these RNA-binding proteins. Nonetheless, we find that YC-1 binds many RNA-processing proteins, which complicates the identification of the specific binding events that induce cell death.

Our SAR studies highlighted the role of furfuryl alcohol in YC-1 activity and suggested that modifications of other regions can potentially enhance sulfonation and improve pharmacokinetic properties (Fig. 5b and Supplementary Table 6). In addition to YC-1, we have identified a broader class of compounds that depend on SULT1A1-mediated sulfonation for their activity against cancer cells. These compounds contain similar chemical moieties that can be sulfonated directly or after simple metabolic conversion to activate their alkylating properties. Outside the region of sulfonation, these compounds differ in overall chemical structure, which confers distinct target binding properties (for example, based on the reported profiles of RITA and N-BIC compared to YC-1; Extended Data Fig. 5d)29,34. Using these leads with fragment-based discovery approaches could expand the landscape of targetable proteins via covalent binding.

In summary, we present a set of small molecules active against SULT1A1-expressing tumor cells. Further development of these agents could lead to prodrug approaches to target specific essential proteins in subsets of liver cancers. SULT1A1 expression transcends the genetic landscape and represents a common hepatic lineage marker, covering many liver cancers. Our data on the YC-1-bound proteome suggest the possibility of using these approaches to target RNA-binding factors. The other SULT1A1-activated compounds could provide a broader toolkit of covalent anticancer agents for additional cellular processes. Furthermore, there is an array of other human sulfotransferases (13 reported SULT family enzymes) with differing target specificity and expression patterns across normal tissues and cancer types19,40,41. Comparable strategies could be used to identify sets of small molecules that are activated by the distinct SULT family enzymes that are highly expressed in different cancer cells, leading to the development of new classes of anticancer agents.

Limitations of the study include uncertainty of the SULT1A1 expression level required to activate YC-1, which might complicate the use of SULT1A1 as a biomarker. Further investigation is also needed to pinpoint the molecular mechanism of YC-1-induced cell death from the many binding proteins identified. Moreover, because SULT1A1 is expressed in normal liver, intestine and lung, development of YC-1 derivatives with a more specific target spectrum and preferable toxicity profiles is warranted.

Methods

Ethics statement

Animal studies adhered to the Massachusetts General Hospital (MGH) Institutional Animal Care and Use Committee-approved protocol 2019N000116. Studies with human specimens were approved by the Office for Human Research Studies at Dana-Farber/Harvard Cancer Center (protocols 19-699, 14-046 and 02-240).

Cell culture

Cell line sources included Riken Bioresource Center (RBE, SSP25 and HUCCT1), Korean Cell Line Bank (SNU1079) and ECACC (CORL105). CCLP1 was provided by T. Whiteside (University of Pittsburgh). ICC2, ICC4, ICC5, ICC6, ICC7, ICC8, ICC12, ICC137, ICC19, ICC20, ICC21, ECC3 and GBC1 were derived from human-derived xenografts using previously described methods14. Cell counting was performed using trypan blue exclusion (quantified on a Countess automated cell counter; Invitrogen). Cell lines were authenticated by short tandem repeat DNA profiling and were tested regularly for mycoplasma (LookOut Mycoplasma PCR kit, Sigma, MP0035).

Screening libraries

Primary screening used the MIPE consisting of 1,912 compounds12, NCATS NPACT22 consisting of 5,099 compounds and a kinase inhibitor library (977 compounds; Supplementary Table 1).

Quantitative high-throughput screen

CCLP1, HUCCT1, RBE and SNU1079 cells were seeded into 1,536-well white-bottom plates using a Multidrop Combi peristaltic dispenser (Thermo Fisher) at 500 cells per well in 5 μl of medium. Screening was performed as described previously42, with cells treated with compound for 72 h and quantified by CellTiter-Glo (Promega) and ViewLux microplate imaging (PerkinElmer). See Supplementary Table 11 for the assay protocol.

Compound activity was determined by plotting concentration–response data for each sample and modeling by a four-parameter logistic fit, yielding IC50 and efficacy (maximal response) values as previously described42. Plate reads for each titration point were first normalized relative to positive control (2 mM bortezomib, 0% activity, full inhibition) and dimethylsulfoxide (DMSO)-only wells (basal, 100% activity). In-house informatics tools were used for data normalization and curve fitting. As in prior studies with the quantitative high-throughput screen, hits ranged widely in potency, and there was variation in the quality of the corresponding concentration–response curves (CRCs; based on efficacy and number of asymptotes). Samples associated with shallow curves or single-point extrapolated concentration responses were assigned as low-confidence actives. Classes –1.1 and –1.2 were highest-confidence complete CRCs (top and bottom asymptotes with efficacies of ≥80% and <80%, respectively). Classes –2.1 and –2.2 were incomplete CRCs (single asymptote with efficacies of ≥80% and <80%, respectively). Class 3 CRCs were active only at the highest concentration or were poorly fit. Class 4 CRCs were inactive (insufficient efficacy or no curve fit). AUC and curve fittings were used for activity comparison and identification of selective agents. High confidence hits were defined based on curve class –1.1, –1.2, –2.1 or –2.2, maximum response of >50% and an IC50 of <10 µM. Screening information is summarized in Supplementary Table 13.
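The normalization, curve model and hit criteria described above can be illustrated with a minimal Python sketch. This is not the in-house informatics code, which is not described in detail here; the function names are hypothetical, curve classes are represented as strings, and the four-parameter logistic is written in one common parameterization that may differ from the one used in the screen.

```python
def normalize_activity(raw, dmso_mean, positive_mean):
    """Scale a raw plate read so DMSO-only wells map to 100% activity
    and positive-control (full inhibition) wells map to 0% activity."""
    return 100.0 * (raw - positive_mean) / (dmso_mean - positive_mean)


def four_param_logistic(conc, bottom, top, ic50, hill):
    """Predicted activity (%) at concentration `conc` (same units as ic50).
    Assumed parameterization: activity falls from `top` to `bottom`."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


# Curve classes named in the text as eligible for high-confidence hits.
HIGH_CONFIDENCE_CLASSES = {"-1.1", "-1.2", "-2.1", "-2.2"}


def is_high_confidence_hit(curve_class, max_response_pct, ic50_um):
    """Apply the three stated criteria: curve class, maximum response
    of >50% and an IC50 of <10 uM."""
    return (
        curve_class in HIGH_CONFIDENCE_CLASSES
        and max_response_pct > 50.0
        and ic50_um < 10.0
    )
```

In practice, the four curve parameters would be estimated by nonlinear least squares over the titration points for each compound; that fitting step is omitted here.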

YC-1 sensitivity profiling across >1,000 cancer cell lines

Authenticated cancer cell lines (1,022) from the Genomics of Drug Sensitivity in Cancer platform17 were screened with a nine-point twofold dilution series of YC-1 at the Center for Molecular Therapeutics at the MGH. Area under the dose–response curve and median inhibitory concentration were determined as previously described17. Cell lines sensitive to YC-1 were defined based on their ranked AUC with Z score < –1.3 and P < 0.10. The fraction of cell lines from each cancer type sensitive to YC-1 was calculated by dividing the number of those sensitive by the total number from that cancer type.
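As an illustration of the sensitivity call and the per-cancer-type fraction, the following Python sketch applies the stated cutoffs to a table of AUC values. It assumes that the Z score is computed against the mean and sample standard deviation of all AUC values and that P is the lower-tail probability of that Z score under a standard normal distribution; neither detail is specified above, and the function and argument names are hypothetical.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def sensitive_fraction_by_type(auc_by_line, type_by_line, z_cut=-1.3, p_cut=0.10):
    """Return {cancer type: fraction of its cell lines called sensitive}.

    auc_by_line:  {cell line: AUC}
    type_by_line: {cell line: cancer type}
    """
    values = list(auc_by_line.values())
    mu, sd = mean(values), stdev(values)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for line, auc in auc_by_line.items():
        z = (auc - mu) / sd
        p = NormalDist().cdf(z)  # assumed one-sided (lower-tail) P value
        cancer_type = type_by_line[line]
        totals[cancer_type] += 1
        if z < z_cut and p < p_cut:
            hits[cancer_type] += 1
    # Number of sensitive lines divided by all lines of that cancer type.
    return {t: hits[t] / totals[t] for t in totals}
```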

Chemistry and synthesis of YC-1 analogs

A detailed description of the chemical reagents and procedures used for the synthesis of YC-1 analogs and the testing for YC-1-conjugated deoxynucleobases and amino acids can be found in the Chemistry Methods Supplement.

Molecular modeling

The three-dimensional structure of SULT1A1 was obtained from the Protein Data Bank (PDB: 3U3M). The structure is complexed with the non-sulfated form PAP and 3-cyano-7-hydroxycoumarin. Before molecular modeling and docking, the protein structure was prepared using the Molecular Operating Environment (MOE; Chemical Computing Group). Hydrogens were added with standard protonation state. The modeled structure was energy minimized using the QuickPrep module in the MOE program. The active site was defined by the cocrystal ligand 3-cyano-7-hydroxycoumarin with a 4.5-Å pocket extension. YC-1 conformations were generated during MOE docking. Initial docking pose placement used Triangle Matcher and the London dG scoring function. Final pose refinement used Rigid Receptor and the GBVI/WSA dG scoring function.

Caspase 3/caspase 7 activity

Cells were seeded at 10,000 cells per well in 96-well plates. The next day, 1 μM YC-1 was added. After incubation with YC-1 for 24 h, caspase 3/caspase 7 activity was assessed using a Caspase-Glo 3/7 assay (Promega, G8090) according to the manufacturer’s protocol. Data are represented as mean ± s.d. between technical triplicates.

Crystal violet staining

Cells were seeded at 100,000 cells per well in six-well plates. The next day, 1 μM YC-1 was added. At specified time points, medium was aspirated, and cells were washed with PBS, fixed with ice-cold methanol for 20 min and stained with 0.5% crystal violet in 25% methanol for 20 min at room temperature. Cells were then rinsed in tap water.

Flow cytometry analysis

For cell cycle analysis, double thymidine block-synchronized cells were released into S phase ±YC-1 and labeled with 10 µM EdU for 30 min. Cells were treated with the Click-iT EdU Alexa Fluor 647 flow cytometry assay kit according to the recommended protocols (Thermo Fisher). Data acquisition was performed on a FACS LSRII apparatus equipped with the FACSDiva software (BD Biosciences). Our gating strategy is summarized in Supplementary Fig. 1.

In vitro resistance model

RBE and SNU1079 cells were plated in six replicates. Nine-step concentrations of YC-1 from IC10 to IC90 were calculated for the parental cells. These concentrations were used to serially treat cells, and concentrations were raised by one step once cell growth was observed for two passages. RBE cells were adapted after 2–3 months, yielding six independent YC-1-resistant clones that were insensitive to YC-1 concentrations two orders of magnitude greater than the IC50 of parental RBE cells. SNU1079 cells were refractory to this assay, with no clones growing beyond a three-step increase in YC-1 concentration.
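One way to derive such a nine-step ladder is from a fitted Hill curve, as in the Python sketch below. This is an assumption for illustration only: the text does not state how the IC10 to IC90 values were calculated, the 10% spacing between steps is assumed, and the function names are hypothetical.

```python
def inhibitory_concentration(ic50, hill, percent):
    """Concentration giving `percent` inhibition under a Hill model,
    in which inhibition = c**h / (ic50**h + c**h)."""
    frac = percent / 100.0
    return ic50 * (frac / (1.0 - frac)) ** (1.0 / hill)


def escalation_ladder(ic50, hill=1.0):
    """Nine concentrations corresponding to IC10, IC20, ..., IC90
    (10% spacing is an assumption)."""
    return [inhibitory_concentration(ic50, hill, p) for p in range(10, 100, 10)]
```

With a Hill slope of 1, the ladder spans roughly one-ninth of the IC50 to nine times the IC50, with the fifth step equal to the IC50.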

Plasmids and transduction

To generate SULT1A1-knockout cells, sgRNAs were cloned into pLentiCRISPRv2 (Addgene, 52961; see Supplementary Table 12 for the sequences). These plasmids were used to generate virus by transfection of HEK293T cells with pCMV-VSV-G (Addgene) and dCMV-dR8.91 packaging plasmids. Collected virus was filtered through 0.45-μm filters and used to spin-infect target cells with 8 μg ml–1 polybrene (Millipore, TR-1003-G) at 2,250 r.p.m. and 37 °C for 60 min. After 24 h, cells were selected in 2 μg ml–1 puromycin for at least 3 d, and pooled populations were first tested for SULT1A1 knockout via immunoblotting. Human SULT1A1 (variant allele V223M) was cloned from SNU1079 mRNA (forward primer 5′-ATCGAGATCTGCCACCATGGAGCTGATCCAGGACAC-3′ and reverse primer 5′-ATCGCTCGAGTCACAGCTCAGAGCGGAAGC-3′). cDNA was inserted into pMSCV-Blast. Because several SULT1A1 polymorphic variants are common in populations and may confer distinct substrate affinity, we also created constructs with the variants V220M, V223M and F247L and S44N, V164A and V223M. Site-directed mutagenesis (New England BioLabs) was used to create SULT1A1 expression vectors resistant to CRISPRv2 gRNA to reintroduce SULT1A1 into knockout cells while not affecting the amino acid sequence (Supplementary Table 12). Murine stem cell virus-derived plasmids were used to generate viruses in combination with pCL-ECO and pCMV-VSV-G packaging plasmids. Successfully transduced target cells were selected with 10 μg ml–1 blasticidin for 1 week.

Immunoblotting

Cell lysis, electrophoresis and immunoblotting were performed as described previously43 using 20 μg of lysates run on 10% to 12% SDS–PAGE gels or on 4–20% Bio-Rad Mini-PROTEAN precast gels (for YC-1 affinity binding studies). PVDF membranes (GE Healthcare) were probed with antibodies to SULT1A1 (PA5-81053, Thermo Scientific; 1:5,000 dilution) or β-actin (Sigma, A5316; 1:10,000 dilution), TARDBP (Proteintech, 10782-2-AP; 1:2,000 dilution), DDX42 (Bethyl Laboratories, A303-353A-T; 1:1,000 dilution), CNOT1 (Proteintech, 14276-1-AP; 1:1,000 dilution), PTBP1 (Proteintech, 12582-1-AP; 1:5,000 dilution), ELAVL1 (Proteintech, 11910-1-AP; 1:5,000 dilution), P4HB (Cell Signaling Technology, 3501S, clone C81H6; 1:2,000 dilution), ANLN (Bethyl Laboratories, A301-406A-T; 1:5,000 dilution), VIM (Cell Signaling Technology, 5741S, clone D21H3; 1:1,000 dilution), ACTN4 (Cell Signaling Technology, 15145S, D7U5A; 1:1,000 dilution), MYH9 (Cell Signaling Technology, 3403S; 1:1,000 dilution) or HSP90AA1 (Santa Cruz Biotechnology, sc-13119, F-8; 1:500 dilution). Detection was performed with horseradish peroxidase (HRP)-conjugated secondary antibodies (Vector Laboratories: anti-rabbit, PI-1000-1, 1:10,000 dilution; anti-mouse, PI-2000-1, 1:10,000 dilution) and SuperSignal West Pico luminol/enhancer solution (Thermo Scientific).

Dose–response assays

Responses to drug were assessed by plating cells in 96-well plates. Growth was quantified using an MTT colorimetric assay read at 490 nm. Each assay was performed at least twice, with the exception of the studies in Fig. 2b, in which case, multiple independent YC-1-resistant lines and replicate parental lines were analyzed in a single assay. IC50 curves were generated from two biological replicates (except for Fig. 2b, using technical replicates) and analyzed with GraphPad Prism 8.

SULT1A1 activity assay

A colorimetric assay for SULT1A1 activity was adapted from Rothman et al.29. MES buffer (pH 7.5), p-nitrophenyl sulfate (5 mM), test substrate (YC-1 or YC-1 derivatives) and PAPS (0.02 mM) were added to a 96-well plate. The reaction was initiated via the addition of recombinant SULT1A1 (20 ng μl–1 or 580 nM) and monitored over time at an absorbance of 405 nm.

Computational identification of SULT1A1-activatable compounds

Identification of compounds with similar response profiles to YC-1 and/or with a correlation between SULT1A1 expression and sensitivity was performed using the NCI-60 database, the PRISM lab at the Broad Institute32, the Cancer Therapeutics Response Portal (CTRP) at the Broad Institute and the Genomics of Drug Sensitivity in Cancer (GDSC) Project at the Sanger Institute17. For analysis of the PRISM (https://depmap.org), CTRP and GDSC databases, the input was YC-1 sensitivity profiles across all cancer cell lines (Supplementary Table 3), and the output was compounds from each database with Pearson correlation score to YC-1 profiles. For analysis on the DTP NCI-60 database, Cellminer (https://discover.nci.nih.gov/cellminer/) was queried with an input of YC-1 (NSC 728165) for similar sensitivity patterns to YC-1 and with SULT1A1 mRNA levels for identification of potential SULT1A1-activatable compounds. The top ~150 correlating compounds from these queries were manually curated to identify a putative chemical moiety for SULT1A1 sulfonation and to group by structural features.
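The profile-matching step can be sketched in Python as follows. This is an illustrative reimplementation rather than the portal or Cellminer code; it assumes that each profile is a mapping from cell line to a sensitivity value and that correlations are computed over the cell lines shared with the YC-1 profile. All names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_by_similarity(reference, profiles):
    """Rank compounds by correlation with the reference profile.

    reference: {cell line: sensitivity} for the query compound
    profiles:  {compound: {cell line: sensitivity}}
    Returns [(compound, r), ...] sorted from highest to lowest r.
    """
    ranked = []
    for compound, profile in profiles.items():
        shared = sorted(set(reference) & set(profile))
        if len(shared) < 3:  # too few shared cell lines to correlate
            continue
        r = pearson([reference[c] for c in shared], [profile[c] for c in shared])
        ranked.append((compound, r))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```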

Immunofluorescence of intracellular YC-1–biotin

The predicted covalent binding of YC-1 suggested opportunities to track its cellular uptake and localization via immunofluorescence. Briefly, cells were seeded into six-well plates on collagenized glass coverslips. YC-1–biotin-treated cells were washed three times with PBS, fixed at room temperature in 4% paraformaldehyde in PBS for 15 min with light agitation, washed three times with PBS, permeabilized and blocked for 30 min with 1% whole goat serum in 0.1% Tween in PBS (PBS-T). Next, DAPI (Molecular Probes) and streptavidin, Alexa Fluor 488 conjugate (Thermo Fisher, S11223), were added for 30 min in PBS-T with light agitation at room temperature. Cells were washed three times with PBS and mounted with ProLong Gold Antifade reagent (Molecular Probes). A Nikon Eclipse Ti inverted fluorescence microscope with an oil immersion ×60 objective was used for imaging. A linear intensity range and no thresholding were used for acquired images. Consistent filter settings for DAPI and 488 FITC channels were used for sequential scans.

Quantitative proteomics

Cells were lysed and prepared for tryptic digest as previously described44. Peptides (50 µg) were labeled using TMT reagents (Thermo Fisher), combined and fractionated using basic reversed-phase high-performance LC. Fractions were analyzed by reversed-phase LC–MS2/MS3 for 3 h on an Orbitrap Fusion or Lumos. MS3 isolation for quantification used simultaneous precursor selection, as previously described45. MS2 spectra were assigned using SEQUEST by searching against the UniProt database on an in-house-built platform. A target–decoy database-based search was used to filter protein identifications to a false-discovery rate (FDR) of <1% (ref. 46). Peptides that matched to more than one protein were assigned to the protein containing the largest number of matched redundant peptide sequences, following the law of parsimony. TMT reporter ion intensities were extracted from the MS3 spectra, selecting the most intense ion within a 0.003-m/z window centered at the predicted m/z value for each reporter ion, and spectra were used for quantification if the sum of the S/N values of all reporter ions divided by the number of analyzed channels was ≥20 and the isolation specificity for the precursor ion was ≥0.75. Protein intensities were calculated by summing the TMT reporter ions for all peptides assigned to a protein. Intensities were first normalized using a bridge channel (pooled digest of all analyzed samples in an experiment) relative to the median bridge channel intensity across all proteins. In a second normalization step, protein intensities measured for each sample were normalized by the average of the median protein intensities measured across the samples.
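The spectrum filter, protein-level summation and second normalization step can be sketched in Python as below. This is a simplified illustration, not the in-house platform: the bridge-channel step is omitted, the second step is interpreted as scaling each channel so that its median protein intensity equals the average of the channel medians, and the record layout and function names are hypothetical.

```python
from statistics import mean, median


def passes_quant_filter(sn_values, isolation_specificity):
    """Keep a spectrum if mean reporter-ion S/N is >=20 and
    precursor isolation specificity is >=0.75."""
    return sum(sn_values) / len(sn_values) >= 20 and isolation_specificity >= 0.75


def protein_intensities(psms):
    """Sum reporter-ion intensities per protein over spectra that pass the filter.

    psms: iterable of (protein, intensities per channel,
                       S/N per channel, isolation specificity)
    """
    totals = {}
    for protein, intensities, sn_values, specificity in psms:
        if not passes_quant_filter(sn_values, specificity):
            continue
        current = totals.get(protein, [0.0] * len(intensities))
        totals[protein] = [t + i for t, i in zip(current, intensities)]
    return totals


def median_normalize(totals):
    """Scale each channel so its median protein intensity equals the
    average of the channel medians (assumed reading of the second step)."""
    n_channels = len(next(iter(totals.values())))
    medians = [median(v[c] for v in totals.values()) for c in range(n_channels)]
    target = mean(medians)
    return {
        protein: [v[c] * target / medians[c] for c in range(n_channels)]
        for protein, v in totals.items()
    }
```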

For affinity-enriched proteomics profiling, after washing, beads were resuspended in 50 mM HEPES (pH 8.5), reduced and alkylated. Urea solution (8 M) was added to a final concentration of 1 M. After tryptic digest, one-third of the resulting peptides of each sample were labeled using TMT-10plex reagents. Labeled samples were combined and analyzed in a 3-h reversed-phase LC–MS2/MS3 run on an Orbitrap Lumos.

Testing for YC-1-conjugated deoxynucleobases and amino acids

DNA adducts

We adapted published methods47 to test whether YC-1 forms DNA adducts. YC-1-treated RBE cells were lysed, and deoxynucleobases were released from DNA by enzymatic cleavage of glycosidic bonds. Samples were analyzed by LC–MS to detect the presence of molecular species with predicted m/z values of YC-1 conjugating to each of the four deoxynucleobases. While the method demonstrates high sensitivity detecting trace amounts of colibactin DNA adduct47, we did not observe evidence supporting YC-1 conjugation to DNA. In addition, the extracted DNA from YC-1–biotin-treated RBE cells with SULT1A1 overexpression was subjected to dot blotting for affinity detection of YC-1–biotin DNA adducts using streptavidin–HRP. There was no signal of streptavidin–HRP retained on the DNA-spotted nylon membrane.

Protein adducts

Nucleic acid-free protein extracts were generated from YC-1–biotin- and DH-YC-1–biotin-treated cells, as described above, and were subjected to dot blotting and detection with streptavidin–HRP (Fig. 6b and Extended Data Fig. 6a). LC–MS was used to detect YC-1-conjugated amino acids as detailed in the Chemistry Methods Supplement.

Computational analysis of the YC-1-binding proteome

Fifty-one significant YC-1-binding proteins were filtered by a binding score of log2 (FC) (immunoprecipitation/control) > 1. Immunoprecipitation/control denotes the ratio of abundance of each protein pulled down by YC-1–biotin relative to inactive YC-1–biotin treatment. Significant binders were analyzed for Gene Ontology by the gene set enrichment analysis (GSEA) tool EnrichR (https://maayanlab.cloud/Enrichr/). To generate Fig. 6f (left), the top enriched terms from the Gene Ontology biological process and Gene Ontology molecular function databases were plotted, with bubble size indicating the significance score of –log10 (FDR) (Supplementary Table 10). The bar graph in Fig. 6f was based on an integrative analysis using the InterPro protein domain database by EnrichR. For comparative analysis between YC-1 binders and most abundantly expressed genes (Extended Data Fig. 6d), enriched terms from Gene Ontology biological process and Gene Ontology molecular function databases were derived from EnrichR using the 500 most abundantly expressed genes based on RNA-sequencing data. The odds ratios of the enriched terms among YC-1-binding proteins were compared with the enriched terms derived using the most abundantly expressed genes. The graph in Fig. 6g was generated by selecting and graphing the most contrasting terms between YC-1 binders and the most abundantly expressed genes.
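The binding-score filter and the multiple-testing correction applied to enrichment P values can be illustrated with the short Python sketch below. EnrichR performs its own enrichment statistics; this sketch only shows the log2 (FC) > 1 cutoff and a standard Benjamini–Hochberg adjustment, with hypothetical function names and inputs.

```python
from math import log2


def binding_score(ip_abundance, control_abundance):
    """log2 fold change of YC-1-biotin pulldown over the inactive control."""
    return log2(ip_abundance / control_abundance)


def significant_binders(abundances, threshold=1.0):
    """Proteins whose binding score exceeds the threshold.

    abundances: {protein: (ip_abundance, control_abundance)}
    """
    return sorted(
        name
        for name, (ip, ctrl) in abundances.items()
        if binding_score(ip, ctrl) > threshold
    )


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```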

YC-1–biotin affinity enrichment of proteins

RBE cells overexpressing SULT1A1 (RBE CSK2 R4) were treated with YC-1–biotin, YC-1–biotin + YC-1 parent competition or inactive DH-YC-1–biotin for 7–8 h. Cell lysates were prepared in nucleic acid-depleting buffer (137 mmol liter–1 NaCl, 1% NP-40, 20 mmol liter–1 Tris (pH 8.0), 1 mM MgCl2, 1 mM CaCl2 and 1:500 benzonase from Millipore, 70746) containing protease inhibitors (complete, Roche) and phosphatase inhibitors (phosphatase inhibitor cocktail sets I and II, Calbiochem). After a BCA protein assay (Thermo Fisher Scientific), YC-1–biotin-bound proteins were enriched by incubating cell lysates with streptavidin-conjugated agarose beads (Thermo Fisher, 20347). After multiple denaturing washes, YC-1–biotin-bound proteins were either processed for MS by direct trypsin digestion or were eluted for affinity blotting by boiling with SDS sample buffer. For reverse coimmunoprecipitation, clarified protein lysate from RBE CSK2 R4 cells was incubated with protein G Dynabeads (Invitrogen, 10004D) conjugated to TARDBP antibody. Immune complexes were washed and analyzed via SDS–PAGE and western blotting. IgG antibody was used as a control.

RNA sequencing

RNA extracted from RBE and SNU1079 cells (treated with YC-1 or vehicle) using a Qiagen RNeasy Plus Mini kit was processed using the TruSeq Stranded mRNA library preparation kit (Illumina). Samples were run on a NextSeq 500 (Illumina). Reads were aligned to the human reference genome hg38 using STAR (v2.5.3a). Transcript levels were quantified using Salmon (v0.9.1). Count data were extracted using tximport, and normalization and comparison were performed using DESeq2 (both Bioconductor). To analyze RNA splicing, BAM output files from RNA-sequencing alignments were sorted and indexed using SAMtools. Insert length was calculated with pe_utils --compute-insert-len. Expression levels (psi) for retained introns and skipped exons were obtained using MISO48. Alternative event annotations of hg38 were generated by rnaseqlib. For filtering events, only events with 10 supporting reads for inclusion or exclusion isoforms and 20 supporting reads for all isoforms were used. The mean psi value of all filtered retained introns and skipped exons was used as the event score.
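The event filtering and event score described above can be sketched in Python as follows. This is an illustration rather than the MISO implementation: the `Event` record, the function names and the example values are hypothetical, and the read thresholds are interpreted here as minimums (at least 10 reads supporting either the inclusion or the exclusion isoform, and at least 20 reads in total, taken as the sum of the two counts).

```python
from statistics import mean
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """One retained-intron or skipped-exon event (hypothetical record layout)."""
    name: str
    psi: float            # percent spliced in, between 0 and 1
    inclusion_reads: int  # reads supporting the inclusion isoform
    exclusion_reads: int  # reads supporting the exclusion isoform


def passes_filter(event: Event, min_either: int = 10, min_total: int = 20) -> bool:
    """Apply the read-support thresholds, read as minimums (an assumption)."""
    enough_either = (event.inclusion_reads >= min_either
                     or event.exclusion_reads >= min_either)
    enough_total = event.inclusion_reads + event.exclusion_reads >= min_total
    return enough_either and enough_total


def event_score(events) -> Optional[float]:
    """Mean psi over the events that pass the filter; None if none pass."""
    kept = [event.psi for event in events if passes_filter(event)]
    return mean(kept) if kept else None


# Hypothetical events for one sample.
events = [
    Event("RI_1", 0.25, 15, 10),  # kept
    Event("RI_2", 0.75, 30, 2),   # kept
    Event("RI_3", 0.90, 5, 4),    # dropped: too few supporting reads
]
score = event_score(events)
```

Comparing this score between YC-1-treated and vehicle-treated samples gives a single summary of the global shift in intron retention or exon skipping.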

TARDBP splicing assay

We adapted a published TARDBP splicing assay38. A plasmid (Addgene, 107859) containing mEGFP fused to mCherry, interrupted by CFTR exon 9 (bound by and skippable with functional TARDBP) was introduced transiently into SULT1A1-overexpressing or control 293T cells. Following YC-1 (200 nM) or DMSO vehicle treatment, single-cell fluorescence images were captured with GFP (488-nm) and red fluorescent protein (RFP; 561-nm) lasers using a confocal Nikon A1R microscope and were analyzed with ImageJ. TARDBP splicing activity was calculated using the normalized ratio of RFP to GFP from over 500 cells using three to five images in triplicate experiments.
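The ratio-based readout described above can be sketched as follows. The source does not state how the RFP-to-GFP ratio was normalized, so this example assumes normalization to the mean ratio of vehicle-treated cells; the function names and intensity values are hypothetical.

```python
from statistics import mean


def rfp_gfp_ratios(cells):
    """Per-cell RFP/GFP ratios from (rfp_intensity, gfp_intensity) pairs.

    Cells without GFP signal are skipped to avoid division by zero.
    """
    return [rfp / gfp for rfp, gfp in cells if gfp > 0]


def normalized_splicing_activity(treated_cells, control_cells):
    """Mean RFP/GFP ratio of treated cells relative to control cells.

    Normalizing to the vehicle-control mean is an assumption of this sketch.
    """
    control_mean = mean(rfp_gfp_ratios(control_cells))
    return mean(rfp_gfp_ratios(treated_cells)) / control_mean


# Hypothetical single-cell intensities (RFP, GFP) in arbitrary units.
dmso_cells = [(2.0, 4.0), (1.0, 2.0)]
yc1_cells = [(1.0, 4.0), (1.0, 4.0)]
activity = normalized_splicing_activity(yc1_cells, dmso_cells)
```

Under this convention, a value below 1 means that treated cells have a lower RFP-to-GFP ratio than vehicle-treated cells.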

Xenograft studies

Immunodeficient mice (NOD-scid Il2rgnull (NSG) strain), age 6–10 weeks, were housed in pathogen-free animal facilities. Studies were conducted under protocol 2019N000116 approved by the MGH Institutional Animal Care and Use Committee, whose regulations for maximum tumor size (<2 cm in greatest diameter) were strictly adhered to. CCLP1 cells (2 × 10⁶ cells) exogenously expressing SULT1A1 or empty vector control were injected subcutaneously into recipient mice (both sexes). When tumor volume reached ~100 mm3, mice were randomly assigned to the YC-1 or vehicle group (five to six mice per group; efforts were made to balance sex). The mice were treated with intraperitoneal injection of YC-1 dissolved in DMSO (50 mg per kg (body weight) daily for 14 d). Tumor volumes were measured twice per week. When tumors became ulcerated or exceeded 1,000 mm3, mice were killed, and tumor samples were collected. For histology, tissue samples were fixed overnight in 10% formalin, embedded in paraffin, sectioned and stained with hematoxylin and eosin.
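The volume-based end point described above can be illustrated with a short Python sketch. The source does not state how tumor volume was computed from caliper measurements; the ellipsoid approximation (length × width² / 2) used here is a common convention and is an assumption, as are the function names.

```python
def tumor_volume_mm3(length_mm, width_mm):
    """Approximate tumor volume from two caliper measurements.

    Uses length * width^2 / 2, an assumed formula that is not given in the source.
    """
    return length_mm * width_mm ** 2 / 2.0


def reached_end_point(volume_mm3, ulcerated, max_volume_mm3=1000.0):
    """End point as stated in the text: ulceration or volume above 1,000 mm3."""
    return ulcerated or volume_mm3 > max_volume_mm3


# A hypothetical 8 mm x 5 mm tumor is at the ~100 mm3 enrollment volume.
enrollment_volume = tumor_volume_mm3(8.0, 5.0)
```

A call such as `reached_end_point(tumor_volume_mm3(14.0, 12.5), ulcerated=False)` returns `True`, because the estimated volume (about 1,094 mm3) exceeds the limit.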

Orthotopic models were established by injecting 500,000 ICC21 cells into the liver49. Sex was not considered for selection of mice but was considered for balancing when grouping. Pilot studies were conducted to define the engraftment and growth kinetics of the orthotopic tumor model, showing that tumors developed by 6 weeks and reached the end point by 8 weeks. Thus, treatment studies were initiated at 6 weeks after injection. Researchers were not blinded during the conduct of the experiments. Both sexes of mice were used and showed similar tumor growth.

Histology and immunostaining

Sample fixation, embedding, sectioning and staining were performed by iHisto as described previously49. For antigen unmasking, specimens were heated in a 2100 Antigen Retriever (Aptum Biologics) in antigen unmasking solution (H-3300, Vector Laboratories), rinsed three times with PBS-T, incubated for 10 min with 1% H2O2 at room temperature and washed three times with PBS-T. After blocking (5% goat serum in PBS-T) for 1 h, tissue sections were incubated at 4 °C overnight with anti-SULT1A1 (Thermo Fisher, CF501838, clone OTI1G10) diluted 1:200 in blocking solution. Samples were washed three times for 3 min each in PBS-T and incubated with MACH 2 rabbit HRP–polymer (Biocare Medical, RHRP520) for 1 h at room temperature. Slides were stained for peroxidase for 3 min with the DAB substrate kit (Biocare Medical, DS900), washed with water and counterstained with hematoxylin. TUNEL staining (R&D Systems, 4810-30-K) was performed according to the manufacturer’s protocol. Slides were photographed with an Olympus DP74 microscope. SULT1A1 staining intensity was evaluated semiquantitatively in tumor slides by a gastrointestinal cancer pathologist (V.D.) who was blinded to the origin of the tissue. Tissue microarrays (3-mm cores) were constructed from resected human samples (N = 200 individuals). Information on sex and age was not available. These studies were approved by the Institutional Review Board in the Office for Human Research Studies at Dana-Farber/Harvard Cancer Center under protocols 19-699, 14-046 and 02-240.

Statistics and reproducibility

Data distribution was assumed to be normal, but this was not formally tested. No statistical methods were used to predetermine sample sizes, but our sample sizes are similar to those reported in previous publications43,50. The gastrointestinal cancer pathologist was blinded to sample allocation during semiquantitative assessment of pathology and immunohistochemistry. Other data collection and analyses were not performed blinded to the experimental conditions. No data were excluded from the analyses, and randomization was limited to the in vivo experiments. Experimental results were reproducible across multiple (two or more) independent biological replicates, shown with two to three replicates.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.