Impact of FLT3-ITD location on cytarabine sensitivity in AML: a network-based approach



Mass spectrometry analyses
The peptides and the phosphopeptides were desalted on StageTips and separated on a reversed-phase column (50 cm, packed in-house with 1.9-µm C18-Reprosil-AQ Pur reversed-phase beads; Dr Maisch GmbH) over 120 min (single-run proteome analysis) or 140 min (single-run phosphoproteome analysis). After elution, peptides were electrosprayed and analyzed by tandem mass spectrometry on a Q Exactive Orbitrap (Thermo Fisher Scientific). Full MS scans were acquired with an AGC target of 3E6, a maximum injection time of 20 ms, and a resolution of 120,000 at 200 m/z. Data were acquired in data-dependent Top20 mode, with subsequent acquisition of higher-energy collisional dissociation (HCD) fragmentation MS/MS spectra of the 20 most intense peaks. MS/MS spectra were acquired with a resolution of 15,000 at 200 m/z, an AGC target of 1E5, an injection time of 20 ms, and an isolation window of 1.6 Th.

MS Data processing
Raw mass spectrometry data were analyzed in the MaxQuant environment, version 1.5.1.6, employing the Andromeda engine for database search. Proteome and phosphoproteome samples were analyzed together by specifying two separate groups and setting group-specific parameters for each sample type. MS/MS spectra were matched against the Mus musculus UniProtKB FASTA database (September 2014), with an FDR of < 1% at the level of proteins, peptides and modifications. Enzyme specificity was set to trypsin, allowing for cleavage N-terminal to proline and between aspartic acid and proline. The search included cysteine carbamidomethylation as a fixed modification. Variable modifications were set to N-terminal protein acetylation and oxidation of methionine, as well as phosphorylation of serine, threonine and tyrosine residues (STY) for the phosphoproteomic samples.
MaxQuant's label-free quantification method, with a minimum ratio count of two, was used for the total proteome samples. For proteome and phosphoproteome analysis, where possible, the identity of peptides present but not sequenced in a given run was obtained by transferring identifications across liquid chromatography (LC)-MS runs. For phosphopeptide identification, Andromeda minimum score and minimum delta score thresholds of 40 and 17, respectively, were used. Peptides had to be fully tryptic in both proteome and phosphoproteome samples, and up to two or four missed cleavages, respectively, were allowed for protease digestion.
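The score thresholds for phosphopeptide identifications can be illustrated with a minimal sketch. It assumes that both thresholds are inclusive minima and uses a hypothetical record type with placeholder sequences and values; the field names are illustrative and are not MaxQuant column names.

```python
from typing import NamedTuple

# Thresholds quoted in the text for phosphopeptide identifications.
MIN_ANDROMEDA_SCORE = 40.0
MIN_DELTA_SCORE = 17.0


class PhosphoIdentification(NamedTuple):
    """Hypothetical record; field names are illustrative only."""
    sequence: str
    score: float
    delta_score: float


def passes_thresholds(ident: PhosphoIdentification) -> bool:
    """Keep an identification only if both minimum scores are reached."""
    return (ident.score >= MIN_ANDROMEDA_SCORE
            and ident.delta_score >= MIN_DELTA_SCORE)


# Placeholder identifications (invented values, for illustration only).
identifications = [
    PhosphoIdentification("PEPTIDEA", 85.2, 40.1),  # passes both thresholds
    PhosphoIdentification("PEPTIDEB", 39.9, 25.0),  # score too low
    PhosphoIdentification("PEPTIDEC", 60.0, 12.3),  # delta score too low
]
accepted = [i.sequence for i in identifications if passes_thresholds(i)]
```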

Proteome and Phosphoproteome Bioinformatics Data Analysis
Bioinformatic analysis was performed in the Perseus software environment (3). Statistical analysis of proteome and phosphoproteome data was performed on logarithmized intensities for those values that were found to be quantified in any experimental condition. Phosphopeptide intensities were normalized by subtracting the median intensity of each sample. A Student's t-test with a permutation-based FDR cutoff of 0.07 and S0 = 0.1 was performed to identify proteins and phosphopeptides significantly modulated between two different conditions. Categorical annotation was added in Perseus in the form of GO biological process (GOBP), molecular function (GOMF) and cellular component (GOCC) terms, KEGG pathways and kinase-substrate motifs (extracted from HPRD). For the kinase-substrate motifs, we performed a 1D annotation enrichment analysis to identify significantly enriched motifs (17). Multiple hypothesis testing was controlled by using a Benjamini-Hochberg FDR threshold of 0.05.
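Two of the operations described above, per-sample median subtraction on log intensities and Benjamini-Hochberg adjustment, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the Perseus implementation: missing values are represented as None, the function names are hypothetical, and the input values are invented.

```python
from statistics import median


def median_center(log_intensities):
    """Subtract the sample median from each quantified log intensity.

    Missing values (None) are ignored for the median and left as None.
    """
    observed = [v for v in log_intensities if v is not None]
    sample_median = median(observed)
    return [None if v is None else v - sample_median for v in log_intensities]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example values for one sample and one set of enrichment tests.
centered = median_center([20.0, 22.0, None, 24.0])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
```

Subtracting the median in log space corresponds to dividing by a per-sample scaling factor on the raw intensity scale, which is why the normalization is applied after logarithmization.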

Signaling Profiler
The Signaling Profiler code and documentation are available as an R package at https://github.com/SaccoPerfettoLab/SignalingProfiler/.

Step 1) Inference of protein activities modulation upon Ara-C treatment
We used three different methods in Signaling Profiler to infer the Ara-C-induced activity modulation of key signaling proteins: i) VIPER score: we used the run_footprint_based_analysis function to infer the activity of kinases and phosphatases from phosphoproteomics data, setting analysis = 'ksea', reg_minsize = 1 and exp_sign = FALSE (so that the whole omic dataset is considered), and correcting VIPER results through a hypergeometric test (hypergeom_corr = TRUE); ii) PhosphoSCORE: we used the phosphoscore_computation function to infer the activity of phosphoproteins that are targets of (de)phosphorylation, setting organism = 'hybrid' to use human orthologous regulatory phosphosites in addition to mouse phosphosites; iii) Proteoscore: we used the activity_from_proteomics function to exploit the experimental fold-change in proteomic data as a proxy of activity modulation, setting organism = 'mouse'.
This process allowed us to predict the activity of 47 and 51 kinases, 4 and 5 phosphatases, 18 and 26 transcription factors, and 158 and 154 other entities in the FLT3 ITD-JMD and FLT3 ITD-TKD cell lines, respectively.
The complete list of inferred protein activities is provided in Supplementary Table S3.
Step 2) Cell-specific naïve causal network generation
We derived a naïve network from the Signaling Profiler built-in 'mouse' prior knowledge network ii) kinases and phosphatases to their direct phosphorylated targets, using get_all_shortest_path_custom, setting path_length = 'one'.

Step 3) Cell-specific causal network generation through CARNIVAL
We optimized the naïve network on the inferred protein activities using the CARNIVAL algorithm (4). In this step, we filtered the naïve network, retaining only causal paths coherent with the activity of the start and end nodes. We used the run_carnival_and_create_graph Signaling Profiler function. We set DNA_DAMAGE as the source node and assigned it activity = 1, since Ara-C activates this phenotype. We set as target nodes all the inferred proteins present in the naïve network. We used CPLEX as the ILP solver (4).
We obtained two cell-specific networks linking DNA damage to key Ara-C-modulated signaling proteins, namely the FLT3 ITD-JMD model (139 nodes and 172 edges) and the FLT3 ITD-TKD model (154 nodes and 195 edges).
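The notion of a causal path being coherent with the activity of its start and end nodes can be illustrated with a simplified sign check. This sketch is not the CARNIVAL ILP formulation: it assumes activities and edge effects are discretized to +1 (activation) or -1 (inhibition), and the function name and example paths are hypothetical.

```python
from math import prod


def path_is_coherent(source_activity, edge_signs, target_activity):
    """Check whether a signed path explains the target's activity.

    source_activity, target_activity: +1 (active) or -1 (inactive).
    edge_signs: sequence of +1 (activation) or -1 (inhibition) edges
    traversed from the source to the target.
    """
    expected_target = source_activity * prod(edge_signs)
    return expected_target == target_activity


# DNA_DAMAGE is the source node with activity +1, as in the text.
# The paths below are invented for illustration.
dna_damage = 1

# Activation followed by inhibition predicts an inactive target.
coherent = path_is_coherent(dna_damage, [1, -1], -1)

# The same path cannot explain a target inferred as active.
incoherent = path_is_coherent(dna_damage, [1, -1], 1)
```

In this simplified view, a path is kept only when the product of its edge signs carries the source activity to the sign inferred for the target; the ILP optimization additionally selects a consistent subnetwork across all targets at once.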

Reactive oxygen species (ROS) levels were analyzed using the CellROX™ Deep Red Flow Cytometry Assay Kit according to the kit instructions (Cat. C10491, Thermo Fisher Scientific). Briefly, CellROX (500 nM) was added to cell cultures and incubated at 37°C for 60 minutes in the dark. 1 × 10^6 cells were collected and washed twice in ice-cold PBS, then samples were immediately analyzed by flow cytometry. The data obtained were analyzed with CytExpert software.

Figure S1. FLT3 ITD-TKD mutation confers resistance to Ara-C exposure in 32D cell line.
Annexin-V apoptosis assay of 32D cells exposed to 20 μM Ara-C for 24 h. Data are presented as percentage of apoptotic cells obtained from three independent experiments. Statistical analysis was performed by ANOVA test (*p < 0.05).

Whole cell lysates of Ba/F3 cells exposed to 20 μM Ara-C for 24 h were subjected to Western blotting to measure DDR protein levels for the indicated antibodies. Actin was used as loading control. Representative images of three independent experiments are reported. l.e. = short exposure, h.e. = long exposure.

A) Ba/F3 cells were treated with increasing doses of Ara-C and/or THZ1 for 24 hours. Cell viability was assessed by MTT assay. Data are presented as fold change on control condition of optical density measurements (OD = 595 nm) obtained from three biological replicates. ***p < 0.001, ****p < 0.0001; ANOVA test. B) Heatmaps representing the dose response to Ara-C and THZ1 treatment in FLT3-ITD cells, generated by using the SynergyFinder tool. C) Heatmaps representing the synergy score of Ara-C and THZ1 drugs in FLT3-ITD cells, generated by using the SynergyFinder tool.

Supplementary Tables
Table S1. Quantified proteins in proteomic analysis (mass spectrometry).
Table S2. Quantified phosphopeptides in phosphoproteomic analysis (mass spectrometry).
Table S3. The complete list of protein activities inferred in the first step of "Signaling Profiler" pipeline.
Table S4. GO Biological Process terms enriched from list of proteins in FLT3 ITD-TKD specific model obtained with gProfiler.