Synergistic drug combinations and machine learning for drug repurposing in chordoma

Chordoma is a devastating rare cancer that affects one in a million people. With a mean-survival of just 6 years and no approved medicines, the primary treatments are surgery and radiation. In order to speed new medicines to chordoma patients, a drug repurposing strategy represents an attractive approach. Drugs that have already advanced through human clinical safety trials have the potential to be approved more quickly than de novo discovered medicines on new targets. We have taken two strategies to enable this: (1) generated and validated machine learning models of chordoma inhibition and screened compounds of interest in vitro. (2) Tested combinations of approved kinase inhibitors already being individually evaluated for chordoma. Several published studies of compounds screened against chordoma cell lines were used to generate Bayesian Machine learning models which were then used to score compounds selected from the NIH NCATS industry-provided assets. Out of these compounds, the mTOR inhibitor AZD2014, was the most potent against chordoma cell lines (IC50 0.35 µM U-CH1 and 0.61 µM U-CH2). Several studies have shown the importance of the mTOR signaling pathway in chordoma and suggest it as a promising avenue for targeted therapy. Additionally, two currently FDA approved drugs, afatinib and palbociclib (EGFR and CDK4/6 inhibitors, respectively) demonstrated synergy in vitro (CI50 = 0.43) while AZD2014 and afatanib also showed synergy (CI50 = 0.41) against a chordoma cell in vitro. These findings may be of interest clinically, and this in vitro- and in silico approach could also be applied to other rare cancers.

Drug repurposing or repositioning is an approach whereby new therapeutic uses for existing drugs or clinical candidates are identified [10][11][12][13][14] . High throughput screens, virtual screening or serendipitous observations are employed to enable drug repurposing 13 . For example we have previously identified approved drugs active against the Ebola virus 15 and Chagas Disease 16 using Bayesian and other machine learning models. In addition, there are several ongoing efforts to demonstrate new uses for molecules that have been through clinical trials for other uses but were subsequently shelved. One such example is the NIH NCATS industry-provided assets that could be potentially repurposed (https ://ncats .nih.gov/ntu/asset s/curre nt). We have now developed a strategy for virtual screening such compounds then testing in vitro and will describe this approach applied to chordoma.
Further, two FDA-approved kinase inhibitor drugs-palbociclib, a breast cancer drug, and afatinib (Fig. 1A,B), a non-small cell lung carcinoma drug-have shown equally robust efficacy in patient derived xenograft (PDX) and cell-line derived xenograft (CDX) models of chordoma. Palbociclib 17 and afatinib 18 were designed to specifically target different kinases, CDK4/6 and EGFR, respectively. The utility of each in preclinical chordoma models implies that multiple oncogenic biological pathways may drive chordoma as is true in many other cancers.
We hypothesized that combinations of clinical kinase inhibitors in chordoma models and patients may act additively or synergistically by dampening oncogenic signaling in multiple pathways. Both primary targets and secondary targets can play a role in this. A combination therapy of afatinib and palbociclib is of particular interest but there may also be many other potential combinations of kinase inhibitors. Chordoma PDX and CDX mouse models respond equally well to afatinib and palbociclib, though these drugs target divergent and minimally overlapping regions of the kinome 19 . A combination of EGFR/CDK inhibitors, i.e., afatinib/palbociclib, may target multiple oncogenic signaling pathways simultaneously. Following the same rationale, we now evaluate the in vitro efficacy of EGFR/CDK inhibitor combinations prior to future in vivo PDX and CDX mouse model studies. We envision that these proposed studies will enable and support future drug combination chordoma clinical trials.

Results
Machine learning models for chordoma drug discovery. Several recently published studies of compounds screened against chordoma cell lines 20,21 were used to generate Bayesian machine learning models with our Assay Central software 10,12,[22][23][24][25][26][27][28][29] . In one published chordoma study 1097 compounds were screened against 3 chordoma cell lines (U-CH1, U-CH2, MUG-Chor1) and 27 had chordoma selective cytotoxicity 20 and many of these were EGFR inhibitors. A more recent study from the Broad Institute and collaborators profiled 459 com-   Figure S1). These models were also used to score the 3 molecules not in these training sets ( Table 2 and Fig. 1C,D) and their scores would suggest the value of testing them in vitro. The atom coloring feature helps to suggest which molecule features correspond to activity with a particular model [e.g. with the EGFR model AZD2014 has a large area of the molecule colored green (favorable for activity) while the other molecules have fewer green colored atoms ( Fig. 2A-C)].
in vitro studies. We tested AZD2014 ( in UCH1 and UCH2 cell lines, along with known chordoma inhibitors afatinib and palbociclib, using reduction of resazurin to resarufin as an assay for metabolic activity of chordoma cell lines. As previously reported 18 , afatinib is a potent inhibitor of U-CH1, but not U-CH2. The Bayesian model we have termed 'EGFR' correctly predicted the rank order of these 3 compounds not in the training sets (Table 2). Of the drugs predicted by our model, AZD2014 was the most potent against both cell lines (IC 50 0.35 µM U-CH1 and 0.61 µM U-CH2, Table 3 and Fig. 3A,B), though maximal inhibition of U-CH2 cells was limited. AZD0530 and RDEA119 also showed moderate potency against U-CH1 (IC 50 1.5 µM and 2.4 µM, respectively) but not U-CH2. AZD4054 had low potency, but even at 11 µM this was unexpected given that its target, the endothelin A receptor, has no known functions in chordoma.
in vitro studies-combination studies of approved drugs. Next we tested the combinations of drugs used in this study to determine whether inhibition of multiple biological pathways would result in synergistic effects on chordoma cell lines. In order to determine this we calculated the combination index (CI) for each of the pairs of drugs mixed at different ratios, where CI < 1 indicates a synergistic effect. For each of the pairs of drugs used, CI was then calculated using five eight-point dose curves, each with a constant ratio of the two drugs.

Discussion
Epidermal Growth Factor Receptor (EGFR) is a receptor tyrosine kinase (RTK) 30 . Activation of EGFR leads to the phosphorylation of proteins in downstream signaling pathways, including the PI3k-Akt-mTOR and RAS-RAF-MEK-MAPK pathways 30 . Both of these pathways are critical in regulating cellular apoptosis, proliferation, migration, and survival. Broadly speaking, over-expression of EGFR, which is located on chromosome 7p12, can increase cellular proliferation and contribute to aggressive tumor behavior 31 . Within the molecular context Table 1. Chordoma Bayesian model statistics. Datasets were named as "Broad" 21 and "EGFR" 20 , and underwent curation to remove problematic molecules before model building. Data represent fivefold cross validation. The "Broad" dataset is named as such because the data came from a chordoma screen at the Broad Institute. The "EGFR" data set is so named because it came from a paper that highlighted the activity of EGFR compounds in chordoma and is not meant to construe a dataset made up entirely of EGFR compounds. Both datasets contain a wide variety of compounds that inhibit a broad range of targets.  8 . Several FDA-approved and clinical kinase inhibitors whose main therapeutic target is EGFR have been tested in preclinical in vitro and in vivo models of chordoma 33 . Gefitinib and erlotinib, two FDA-approved EGFR inhibitors used in the treatment of non-small-cell lung carcinoma (NSCLC), individually inhibited U-CH1, U-CH7, UM-Ch-SCor1, and MUG-Chor1 cellular proliferation in dose-dependent manners 20,34 . Afatinib, another FDA-approved EGFR inhibitor used against NSCLC, had anti-proliferative activity against all chordoma cell lines tested (IC 50 < 0.7 µM) except for JHC7 20 . Afatinib was also found to promote degradation of EGFR and brachyury, both of which are crucial to chordoma cell growth 20 . In vivo studies for erlotinib and afatinib have been reported. Erlotinib treatment significantly lowered tumor volume relative to controls in a PDX mouse model and reduced p-EGFR 9 . Afatinib treatment resulted in tumor growth inhibition in three PDXs and one CDX with no signs of toxicity in any of the mouse models 9 . In addition to this preclinical data, some clinical evidence exists to support EGFR and its therapeutic agents as potential chordoma treatments. A retrospective study found the median PFS was ~ 15.0 months for five patients treated with erlotinib 35 . For afatinib, there is currently an open clinical trial in the Netherlands (NCT03083678) to evaluate its efficacy against locally advanced and metastatic chordoma.  www.nature.com/scientificreports/ Cyclin-dependent kinase 4 (CDK4) and cyclin-dependent kinase 6 (CDK6) are serine-threonine kinases 36 . CDK4/6 are known oncoproteins, as upregulation of these kinases or inactivation of CDKN2A lead to cell cycle deregulation, increased cell proliferation, and tumorigenesis 37 . CDK4/6 functions downstream of a number of oncogenic pathways, implying that CDK4/6 inhibition may be effective when combined with inhibitors of the upstream pathways 38,39 . Combination treatment of an EGFR sensitive non-small cell lung cancer PDX model with a CDK4/6 inhibitor and an EGFR inhibitor showed combinatorial benefit, providing precedent for a similar application in chordoma tumors which are also sensitive to EGFR inhibition 40 . In another study, palbociclib sensitized lung cancer cells to treatment with EGFR inhibitors 41 . Importantly, in chordoma the CDK4/6 regulatory gene CDKN2A (also known as p16) is frequently lost. CDK4/6 are thus found in an over-activated state in chordoma 42 . From a gene expression perspective, CDK4 and CDK6 mRNAs have been detected in all eight chordoma cell lines that were interrogated: U-CH1, U-CH2, U-CH3, U-CH6, U-CH7, U-CH10, U-CH11, and U-CH12 43 . Likewise, CDK4 and CDK6 protein were observed in the same eight chordoma cell line as above 42 . An immunohistochemistry study of 25 patient samples showed CDK4 was overexpressed in 20% of cases 44 . Tissue microarray analysis of 85 samples from 72 chordoma patients found that ~ 98% of tissue samples expressed CDK4 42 . Additionally, the mean expression level of CDK4 was significantly higher for non-survivors than survivors at the time of publication 42 . There are three FDA-approved kinase inhibitors whose main therapeutic targets are CDK4/6. Palbociclib, one of these dual CDK4/6 inhibitors, has been tested extensively in preclinical in vitro and in vivo models of chordoma 43 . Palbociclib treatment resulted in decreased cell growth in a dose-responsive manner for all eight chordoma cell lines tested 43 . In PDX and CDX models, treatment with palbociclib resulted in significant inhibition of tumor growth in 5/6 individual models (Table S1). There is currently a clinical trial enrolling in Germany (NCT03110744) to evaluate the efficacy of palbociclib against locally advanced and metastatic chordoma.
mTOR is a serine threonine kinase which is a member of two protein complexes, mTOR complex 1 (mTORC1) and complex 2 (mTORC2). By virtue of these interactions it plays key roles in a variety of cellular processes including metabolism and proliferation. It lies in the PI3K-AKT-mTOR signaling pathway, and components of this pathway are implicated in a wide range of cancers. Chordoma is no exception. Analysis of 50 chordomas showed that a high percentage of tumors were positive for p-AKT, piTSC2, p-mTOR, total mTOR, p-P70S6K, p-RPS6, p-4EBPI and eIF-4E 45 . These are all linked to the complicated mTOR signaling pathway. The authors suggested that a majority of chordomas may respond to mTOR inhibitors or mTOR inhibitors in combination with other drugs. A loss of PTEN (a tumor suppressor which negatively regulates this PI3K-AKT-MTOR pathway) was also observed in 16% of these chordoma cases. A study of chordoma tumors from 111 patients demonstrated that three key proteins in the mTOR pathway (p4EBP1, pS6-Ser240/244, pS6-Ser235/236) were activated in 46% of the tumors 46 . In a different study of just skull base chordoma, expression of PI3 and AKT pathway genes was significantly upregulated in brachyury high expression tumors. Importantly, the transcription factor brachyury (gene name TBXT), a key driver of chordoma, seems linked to this pathway 47 . The U-CH2 cell line is inhibited by www.nature.com/scientificreports/ the dual PI3k/mTOR inhibitor BEZ-235, but not the mTORC1-specific inhibitor rapamycin 47 . Given our observations that U-CH2 is inhibited by AZD2014, a highly specific inhibitor of mTOR which inhibits both mTORC1 and mTORC2 activity, we hypothesize that mTORC2 has a critical role in promoting chordoma proliferation or survival. An analysis of sacral chordomas demonstrated that mTOR expression levels were significantly higher than in adjacent normal tissue 48 . The PI3K-mTOR pathway is also upregulated in the UCH-1 chordoma cell line. A PI3K/mTOR inhibitor inhibited both AKT and MTOR activation in this cell line. The compound inhibited proliferation and induced apoptosis 49 . Finally, the MTOR inhibitor MLN0128 (sapanisertib) decreased activity of the PI3K-AKT-MTOR pathway in vivo in a clival chordoma PDX model 50 . Taken together, clearly this pathway is consistently dysregulated in chordomas, and a promising avenue for targeted therapy. Additionally, it has been observed clinically that in one patient, rapamycin, an FDA-approved mTOR inhibitor, slowed the progression of a recurrent chordoma tumor 51 . The utility of these various kinase inhibitors in preclinical and clinical chordoma settings implies that multiple oncogenic biological pathways may drive chordoma. Based on these observations, we hypothesized that combinations of clinical EGFR, CDK, and mTOR inhibitors may act synergistically by dampening oncogenic signaling in multiple pathways. Using a combination of www.nature.com/scientificreports/ computational and in vitro approaches we have identified AZD2014 (targets mTORC1 and mTORC2), RDEA119 (targets MEK1/2), and AZD4054 (targets endothelin A receptor) as compounds which inhibit chordoma cell lines to differing extents. We also assessed several kinase drug combinations, of these, afatinib (EGFR) and palbociclib (CDK4/6) as well as afatinib and AZD2014 showed substantial synergy against U-CH1 cells in vitro. Since each of these pairs of compounds likely have kinome inhibition profiles that are distinct from one another 19 , co-dosing will allow us to target two key chordoma vulnerabilities simultaneously. Another advantage of working with these compounds is that pharmacokinetics data for in vivo studies is available [52][53][54][55] . There is a growing body of data on the effects of small molecules on the growth of chordoma cell lines 20,21 . The results of our studies indicate that machine learning approaches utilizing these data 20,21 can help identify compounds suitable for testing in chordoma cell line models, and that we have identified combinations of inhibitors that act on different pathways important for chordoma cell growth which can behave synergistically. This work paves the way for future in vivo evaluation of CDK / mTOR inhibitor combinations in animal models of chordoma and then in the clinic. A treatment for chordoma would fill a dire unmet clinical need, and repurposing of available medicines is likely the quickest approach to approved treatments.  . Plates were then incubated for a further 18hrs before reading using an Infinite F200 microplate reader with a Connect Stacker (Tecan Group Ltd). iControl software (Version 1.11) was then used to measure the fluorescence intensity of resarufin at EX535nm and EM595nm. The resulting relative fluorescence units are proportional to cellular redox activity, which is a common proxy for the quantity of living cells 56 .
After adjusting for the effects of DMSO in vehicle-only controls, raw fluorescence data was fitted to a fourparameter log-logistic dose response model. This model was constrained to fit a common upper asymptote for each individual experiment, and a positive lower asymptote. Each experiment was repeated for a total of 2-7 biological replicates, and estimated IC 50 ′s and maximal effects were combined using geometric and arithmetic means, respectively. combination testing. For synergy experiments, drugs were tested in combination using a fixed-ratio "ray" design 57 . Drugs were diluted to 10 × previously estimated IC 50 and combined in specific ratios (80%:20%, 50%:50% and 20%:80%). These combined drugs were serially diluted to maintain a constant ratio. Doseresponse curves were fit for each mixture of drugs. Data from multiple biological replicates was combined in a mixed-effects model, using the function metadrc() in the R package medrc. 58 Absolute IC 50 estimates from these models were then used to calculate the combination index CI = a/A + b/B, where A and B are doses of individual drugs that produce a specified effect, and (a, b) is the pair of doses in a combination that produces the same effect 59 . Confidence intervals of CI were estimated using confidence limits of each IC 50 estimate in the same equation. CI < 1 indicates synergy, CI = 1 no effect, and CI > 1 antagonism.
Software for machine learning. Chordoma datasets were named as Broad 21 and EGFR 20 . These datasets consist of molecules screened against multiple cell lines and the original authors provided potency or activity classifications which were used as a binary score. These datasets underwent curation to remove problematic molecules before model building as described elsewhere 10,12,[22][23][24][25][26][27][28][29] . We utilized Assay Central which has been previously described in detail 10,12,[22][23][24][25][26][27][28][29] to prepare and merge datasets collated in Molecular Notebook 60 , as well as generate Bayesian models using ECFP6 descriptors 61,62 . Briefly, the Assay Central project includes automated workflows for curating well-defined structure-activity datasets that employ a set of rules for the detection of problematic data (i.e. abnormal valences, multiple components) that can be corrected by multiple means. Data that is compatible with machine learning is then used to generate a Bayesian model for prospective bioactivity predictions. Each Bayesian model in Assay Central includes the following metrics generated from fivefold cross validation: Recall, Precision, Specificity, F1-Score, Receiver Operating Characteristic (ROC) curve, Cohen's Kappa (CK) 63,64 , and the Matthews Correlation Coefficient (MCC) 65 . Assay Central prediction workflows assign a probabilitylike score 61,62 , with values above 0.5 considered an active prediction, and an applicability score which assesses the portion of fragments overlapping with the training set molecules to the input compounds. Predictions were applied to the specific compounds of interest from NCATS (https ://web.archi ve.org/web/20170 71616 3452/https :/ncats .nih.gov/ntu/asset s/2017#Adult _Indic ation s ) that were not in the training sets (AZD0530 was in the Broad dataset and was therefore not scored by the models) prior to in vitro testing as we have done previously 25 .