Discovery of first-in-class inhibitors of ASH1L histone methyltransferase with anti-leukemic activity

ASH1L histone methyltransferase plays a crucial role in the pathogenesis of different diseases, including acute leukemia. While ASH1L represents an attractive drug target, developing ASH1L inhibitors is challenging, as the catalytic SET domain adopts an inactive conformation with an autoinhibitory loop blocking access to the active site. Here, by applying fragment-based screening followed by medicinal chemistry and structure-based design, we developed first-in-class small molecule inhibitors of the ASH1L SET domain. The crystal structures of ASH1L-inhibitor complexes reveal compound binding to the autoinhibitory loop region in the SET domain. When tested in MLL leukemia models, our lead compound, AS-99, blocks cell proliferation, induces apoptosis and differentiation, downregulates MLL fusion target genes, and reduces the leukemia burden in vivo. This work validates the ASH1L SET domain as a druggable target and provides a chemical probe to further study the biological functions of ASH1L as well as to develop therapeutic agents.

- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability

The coordinates for the ASH1L-AS-5 and ASH1L-AS-85 complexes were deposited in the PDB under PDB codes 6X0P (https://www.rcsb.org/structure/unreleased/6X0P) and 6WZW (https://www.rcsb.org/structure/unreleased/6WZW), respectively. RNA-seq and CUT&RUN data were deposited to GEO under the accession number GSE150087 (SubSeries numbers: GSE150085 and GSE150086 for the RNA-seq and CUT&RUN data, respectively; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE150087).

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.
Sample size
Sample sizes were not predetermined using statistical software.

Data exclusions
No data were excluded.

Replication
Cell-based experiments were performed at least twice with two to three technical replicates in each experiment, as indicated in the figure legends. All attempts were successful. The RNA-seq experiment was performed once with three technical replicates. The CUT&RUN experiment was performed once with one replicate. In the in vivo study, the control group had 6 mice and the treatment group had 7 mice.

Randomization
Mice grouping was based on the bioluminescence level of individual mice. Experiments were not blinded or randomized.

Blinding
No blinding was used. Experimental results were obtained by automated methods (e.g. qRT-PCR, MTT read-out, etc.).

We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Antibodies
All antibodies were obtained commercially and had been validated by the companies. The information can be accessed through the manufacturers' websites using the catalog numbers provided in the Methods section.

Eukaryotic cell lines
Policy information about cell lines

Cell line source(s)
MV4;11, K562, KOPN8 and RS4;11 cell lines were obtained from ATCC. MOLM13 and SET2 cell lines were obtained from DSMZ. MLL-AF9, MLL-AF6, E2A-HLF and HM-2 cells were generated by transforming murine progenitor cells with the indicated oncogenes (details are provided in the manuscript or were previously described). Human CD34+ hematopoietic cord blood cells were purchased from Stem Cell Technologies.

Authentication
Authentication of human cells was performed by the supplying vendors (ATCC using DNA fingerprinting and DSMZ by STR profiling, as described on the ATCC and DSMZ web pages). No additional authentication of these cell lines was performed. Murine cell lines transformed with oncogenes were confirmed by colony formation.

Mycoplasma contamination
All cell isolates tested negative for mycoplasma.

This study did not involve field-collected animals.

University of Michigan Committee on Use and Care of Animals and Unit for Laboratory Animal Medicine
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Methodology
Sample preparation
For mice transplanted with MV4;11 cells, cells from peripheral blood or spleen were isolated and suspended in PBS with 1% FBS for flow cytometry analysis. Red blood cells were lysed with ACK (Lonza). Cells from human cell lines were washed in PBS with 1% FBS (flow buffer), stained for 30 minutes, washed twice with flow buffer, and used for flow cytometry analysis.

Instrument
FACSCelesta flow cytometer (Becton-Dickinson)

Software
The data were collected using BD FACSDiva version 8 and analyzed using FlowJo version 10.6.0.

Cell population abundance
No cell sorting was performed.

Gating strategy
FSC/SSC were used for gating out the debris, and FSC-A/FSC-W were used for single cell selection. For the hCD45 gating in the samples derived from mice, the non-stained samples served as a negative control. For gating in experiments using human cell lines, the non-stained samples were used as a reference. The gating strategy is presented in the Supplementary Information: Supplementary Fig. 8c, 8d; Supplementary Fig. 11e and Supplementary Fig. 12c.

Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information.
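The gating steps described above (debris exclusion on FSC/SSC, single cell selection on FSC-A/FSC-W, and a positivity cutoff taken from non-stained samples) can be illustrated with a minimal sketch. This is not the FlowJo workflow used in the study: the function names, the rectangular gates, all threshold values, and the choice of the 99th percentile of the non-stained control as the cutoff are assumptions made only for illustration.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0-100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def gate_events(fsc_a, ssc_a, fsc_w, marker, unstained_marker,
                fsc_min, ssc_min, fsc_w_max, negative_pct=99):
    """Apply three sequential gates; every threshold here is hypothetical."""
    # Gate 1: FSC/SSC gate that drops low-scatter debris.
    cells = [f >= fsc_min and s >= ssc_min for f, s in zip(fsc_a, ssc_a)]
    # Gate 2: FSC-A/FSC-W gate; wide pulses are treated as doublets.
    singlets = [c and w <= fsc_w_max for c, w in zip(cells, fsc_w)]
    # Gate 3: cutoff from the non-stained control (assumed 99th percentile).
    threshold = percentile_nearest_rank(unstained_marker, negative_pct)
    positive = [s and m > threshold for s, m in zip(singlets, marker)]
    return cells, singlets, positive, threshold


# Toy events: one debris event, one doublet, one marker-positive singlet.
fsc_a = [10, 500, 600, 700, 550]
ssc_a = [5, 300, 350, 400, 320]
fsc_w = [50, 100, 110, 250, 105]
marker = [0, 20, 900, 1000, 15]
unstained = [5, 10, 15, 20, 25]

cells, singlets, positive, threshold = gate_events(
    fsc_a, ssc_a, fsc_w, marker, unstained,
    fsc_min=100, ssc_min=50, fsc_w_max=150,
)
print(sum(positive), "of", sum(singlets), "singlets are marker-positive")
```

In practice gates are usually drawn as polygons on the scatter plots rather than as fixed rectangular cutoffs; the rectangular form is used here only to keep the sketch short.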