Introduction

Cancer is a complex disease caused by abnormal cell growth arising from genetic alterations. The severity and societal impact of the disease, together with the lack of effective therapeutics for many cancer types, have made cancer therapy a key area of research for decades. Traditionally, the treatment of cancer has relied on chemotherapy, combination therapy, and radiation therapy, which are effective in some cases, but the toxicity they introduce to healthy cells limits their use. In contrast, nanotherapeutics provide a more targeted and less invasive alternative. Such controlled drug delivery has several advantages, including lower dose requirements, better control over toxicity, and improved bioavailability1,2,3. Active targeting of tissues is performed using ligands that act as homing devices, with functionalized drug molecules encapsulated within the particle. In addition, a large number of other factors, such as the size, chemical structure, and delivery method, are involved in the design of these nanodrug carriers4.

A typical nanoparticle (NP) consists of two or three basic layers: the surface, the shell, and the core. Each layer can vary in physicochemical properties such as shape, size, porosity, hydrophobicity, or elemental composition5. Several agents, such as carbohydrates, vitamins, peptides, and proteins, have been shown to work well as cell-binding moieties. Consequently, designing an NP reduces to a rich set of chemical problems with a large number of parameters to explore. Moreover, particle efficacy is intricately connected to the chosen design specifications6,7,8,9. Therapeutic efficacy depends on the delivery of the drug molecules to their target destinations, as after exposure they may dissolve before reaching the destination10. Often, statistics derived from configurations, such as the solvent accessible surface area (SASA), provide a good indication of the efficacy and bioavailability of drugs in a given state11. The SASA is defined as the region of the molecular surface that is exposed enough to interact with solvent molecules. Hence, the design of an NP must incorporate physicochemical properties whose biological interactions lead to a higher SASA value12. However, exploring the vast parameter space to identify designs with target characteristics remains a major limitation in terms of both time and cost.

A more efficient and reliable way to find a good design is to use molecular dynamics (MD) simulations. MD simulations can model systems of biological relevance, such as entire proteins in solution with explicit solvent representations, membrane-embedded proteins, or large macromolecular complexes such as nucleosomes or ribosomes. MD simulations allow in silico modelling of the cellular uptake and intracellular trafficking of NPs. In addition, these models provide data for monitoring NP interactions as they enter and exit a cell, which are difficult to obtain otherwise13. Internally, simulations make use of the forces acting on every atom, obtained by deriving complex equations and deducing the potential energy from the molecular structure. However, these complex equations create two principal challenges14. The first is deriving the potential energy of the system, which requires further refinement because the resulting models are poorly suited to certain systems. The second is the high computational demand of the simulations, which prohibits routine simulations longer than a microsecond and leads to inadequate sampling of conformational states15.

One way of accelerating MD simulations is to take advantage of advanced hardware technologies such as graphics processing units (GPUs)16,17,18. A GPU provides higher performance than a single CPU core in terms of speed and overall processor utilization. However, GPUs lack the architectural flexibility to implement all MD simulation algorithms efficiently; extensive rework and optimization must be applied to each algorithm to make it run well on this specialized hardware.

The limitations of hardware architecture can be addressed by using machine learning (ML) in the development of MD simulations and molecular modelling. Wang et al. reviewed the use of ML-based methods to analyse and enhance MD simulations19. One use of ML is to analyse the high-dimensional data produced by MD simulations with artificial neural networks (ANNs). Different forms of ANNs can produce latent vectors in a low-dimensional feature space from trajectory data, enabling efficient evaluation of the equilibrium and dynamic properties of systems20,21,22,23,24,25,26,27,28. Another set of studies focuses on the active involvement of ML-based techniques during the simulation process to improve the sampling time and capacity29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47. However, for both objectives, model interpretability and transferability to new systems remain a challenge. Another recent work implemented distance-based ML algorithms to simulate the atomistic interactions of a \(Au_{\text {38}}(SCH_{3})_{\text {24}}\) nanocluster. The presented solution transforms atomic coordinates into vectors of atomic interactions through descriptors that can be used directly with ML models. A Monte Carlo strategy was used to evaluate the energy landscape learned by the ML models and performed well. However, the models were trained solely on \(Au_{\text {38}}(SCH_{3})_{\text {24}}\) nanoclusters and focused mainly on faster probing of the configuration space. Hence, a study that can predict target metrics for NP designs, such as the SASA value, without running MD simulations over a long period, and that generalizes to new systems, holds much significance.

In this study, we propose a twofold approach: on the one hand, we tackle the applicability of models to new NP designs, and on the other hand, we use explainable AI to interpret the results. The proposed solution consists of three steps: transforming the data, using a hybrid ML network to predict the SASA value at a specified timestep, and using feature importance to explain and validate the results. Atomic coordinate data for different NP designs are derived from MD simulations and transformed using the many-body tensor representation (MBTR) descriptor, which reduces the data size and complexity while reflecting the interatomic interactions between pairs of elements. We present a combined ML system consisting of a time series model that simulates the MD interactions over a specified period and a second deep neural network (DNN)-based model that calculates the SASA metric from the intermediate state. Feature importance is calculated using SHAP values to reflect the contribution of each element pair’s interactions. In this paper, we show that ML methods can substantially reduce the cost of NP simulations and, consequently, provide an efficient assistive tool for exploring the NP design space. This work is a novel study predicting the SASA as a representative example; however, the approach can be generalized to a wide range of other properties and molecules. In addition, we introduce a way to provide explanations for the models that both increases their reliability and can give insights into better NP designs.

Results

The data used in this study are snapshots from MD simulations of NP designs functionalized with 9 different drug types (see Table 3). These snapshots were taken over variable periods at a rate of one snapshot per nanosecond: 64 NP designs were recorded over 300 ns, 32 over 200 ns, and 23 over 120 ns. Each snapshot contains the Cartesian coordinates of the atoms in the system along with other information and captures how the atom movements are dictated by the environment. We first transform these data into vector encodings by extracting design-specific global properties through MBTR descriptors. As a result, the data become manageable and compressed, with only \(n_{\text {features}} = 72\) features representing each state. To apply ML models for the prediction of SASA values at future timesteps, the proposed solution combines two different models, each responsible for a part of the overall objective, as illustrated in the proposed workflow in Fig. 1. These are:

  1.

    Time series model: This model learns the inherent properties that influence atomic interactions from a fixed window of MBTR vectors. The learned pattern is used to forecast future MBTR vectors in a sliding-window fashion until the vector for the specified time is predicted. Hence, this model enables the approximation of the state of an NP at any given future point in time.

  2.

    SASA model: To calculate the SASA value by exploiting the transitive property between the atomic coordinates and the MBTR vectors, we use a second model. This model predicts the \(\vert SASA^{\langle t\rangle }\vert = P(\theta \vert V^{\langle t\rangle }_{\text {MBTR}})\) value for any particular timestep, t, where \(\theta\) is the learned parameter. A minimal sketch of how the two models are chained at inference time is given after this list.
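As a sketch of how the two models are chained at inference time, the following illustrates the idea; `forecast_next` and `sasa_model` are placeholders for the time series and SASA models described in the Methods, and this is an illustration rather than the authors' implementation.

```python
import numpy as np

def predict_sasa_at(t, initial_window, forecast_next, sasa_model):
    """Forecast MBTR vectors forward in time with a sliding window, then
    evaluate the SASA model once on the vector reached at timestep t.

    initial_window : (w_s, n_features) MBTR vectors from the first w_s ns of simulation
    forecast_next  : callable mapping a (w_s, n_features) window to the next MBTR vector
    sasa_model     : regressor mapping one MBTR vector to a scalar SASA value
    """
    window = np.asarray(initial_window, dtype=float)
    w_s = window.shape[0]
    # Number of forecasting steps; the paper writes s_steps = t - w_s - 1, and the exact
    # count depends on the indexing convention used for the initial window.
    for _ in range(t - w_s):
        nxt = forecast_next(window)                 # predicted MBTR one nanosecond ahead
        window = np.vstack([window[1:], nxt])       # slide the window forward by one timestep
    return float(np.ravel(sasa_model.predict(window[-1][None, :]))[0])
```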

Figure 1

ML-based hybrid solution for the prediction of SASA values. (a) Atomic coordinates for different NP designs derived from MD simulations in an aqueous environment. These data are in Protein Data Bank (PDB) format and include other information such as the respective residues. (b) ML-usable MBTR representation of the data, extracted through a geometric function of pairwise distances between elements. (c) Time series model to accomplish the task of simulation and SASA model for the calculation of the target label. (d) A predefined batch of data can be used to forecast changes in the SASA. An optimal size for this predefined batch can be set by weighing the simulation cost of generating it against the error threshold (see Table 1). Although the inputs and all outputs of the models except the final output are MBTR vectors, the graph represents only the input-output relationship.

The data are split into training and test sets with a ratio of 80:20, which translates to 107 designs in the training set and 12 in the test set. During the splitting, the temporal order of the data for each design is preserved so that the models can capture the sequential properties. The range of SASA values varies greatly between designs; hence, the test set is manually chosen to contain representative samples from different ranges in the dataset. Each of the 12 designs in the test set, along with the whole training data, is depicted in Fig. 3a using the minimum and maximum SASA values over the whole period.

Time series prediction

As discussed in the “Methods” section, we experiment with two approaches for time series prediction. Both approaches process the input data with a sliding-window method, and the window size dictates how long the simulations must run before the solution can be used. The first approach is a transformer model that takes multivariate MBTR vectors as input and predicts the next timestep’s MBTR. The transformer model is used because its self-attention mechanism is suitable for effectively approximating the interatomic interactions. The model achieves a mean absolute error (MAE) of 40.16 on the test set of 3120 samples for a fixed window size of 40. We use the MAE as the error metric since it provides a linear score for deviation from the original value on a compact scale. The final MAE values are much higher than expected, which can be attributed to the limited size of the dataset for such a large model. Hence, we use a second approach to minimize the error with the same amount of data.

As the next method, an ensemble approach is trained using 72 separate XGBoost models48, with each model predicting the next timestep’s value of one feature. The outputs from the models are then concatenated to produce the final vector for that timestep. The influence of different window sizes on the outcome of the ensemble approach is presented in Table 1; in all cases, the MAE is much smaller and suitable for the solution. The best MAE of 1.57 is achieved for the smallest tested window size of 10.

Figure 2a shows bar plot representations of the predictions made by the ensemble approach and the transformer model for a randomly sampled test data pair. Figure 2b shows a detailed bar plot of the MAE of each model in the ensemble approach.

Figure 2

Time series model performance. (a) A scalar value predicted by each model from the ensemble approach and multivariate prediction by the transformer model for a sample data pair of the NCL11 NP design from the test set. The differences between the heights of the bars representing the predictions and the ground-truth values indicate the prediction errors. (b) MAE for each XGBoost model from the ensemble approach over the whole test set. The dashed line represents the average error across all features.

From the predictions, it can be seen that the XGBoost models provide better accuracy than the transformer model. From Fig. 2b, we can see that most of the features in the ensemble approach produce below-average MAEs. Eight features have above-average MAEs, while only 4 of the 72 features have an MAE above 10.

Based on these results, we use the ensemble approach as the time series predictor in the combined solution. Additionally, as this approach uses a classic ML algorithm, it is robust to smaller dataset sizes.

SASA prediction

To determine the best-performing deep neural network for this task, models with different architectures are evaluated. Keeping the number of layers and the activation functions the same, we experiment with different numbers of neurons in the feedforward network. The model with 512 neurons in each hidden layer has an MAE of 6265.85, whereas the model with 128 neurons has a higher MAE of 6810.92. In contrast, the model with 256 neurons in each hidden layer performs best, with an MAE of 936.42; hence, it is used as the base model.

Both the MBTR vectors and the SASA values of the NP designs at each timestep were stacked vertically to form the training and testing datasets. Figure 3b illustrates the predicted and expected SASA values, which change continuously over 300 iterations for each of the different designs, shown in sequential order.

Figure 3

(a) The distribution of NP designs in the training and the test set based on the minimum and maximum SASA values. Each dot represents test data for a particular NP design, with the letters of the design name referring to the drug type and the remainder being a unique identifier. The training samples are represented by triangles and grouped by their sizes. (b) Visualization comparing the real and predicted SASA values of different NP designs (separated by dashed lines) from the test set over 300 iterations each. The blue line represents the actual SASA value, and the grey line represents the predicted SASA value.

The model can learn the range of SASA values for each design and how the SASA values decrease over time. The model also generalizes well to new, unseen data. As seen in Fig. 3b, after encountering a new design every 300 iterations, the model quickly adapts to the changes in the SASA.

Combined inference

As the SASA value takes an uncertain amount of time to reach a stable range, the duration of the MD simulations has to be set in advance to a maximum value within which all NPs are expected to reach that state. Reflecting the same property, inference in the proposed solution is made for a given amount of time by running the time series model \(s_{\text {steps}} = t - w_s - 1\) times, where t is the target timestep and \(w_s\) is the window size. We start the combined inference with the MBTR vectors of the initial timesteps for a fixed window size and use the proposed workflow to predict the SASA value at the 300th timestep. Different window sizes, \(w_s\), the same as those for the time series model, are tested, and the results are evaluated by comparing the actual SASA value at the 300th timestep for each design with the predicted value using Eq. (1). The comparative results are given in Table 1.

$$\begin{aligned} error_{\text {SASA}} = \frac{1}{k} \times \sum _{i=1}^k \vert \hat{y}_i^{\langle \text {t}\rangle } - y_i^{\langle \text {t}\rangle } \vert \end{aligned}$$
(1)

where k refers to the number of NP designs in the test set, t is the final timestep for that design, \(y_i\) is the ground-truth and \(\hat{y}_i\) is the predicted value for the ith design.
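For reference, Eq. (1) amounts to the following computation; the commented evaluation loop uses the hypothetical predict_sasa_at helper sketched earlier and illustrative variable names.

```python
import numpy as np

def error_sasa(y_pred, y_true):
    """Eq. (1): mean absolute deviation between predicted and ground-truth SASA
    values at the final timestep, averaged over the k test-set designs."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))

# Hypothetical evaluation over the test designs for w_s = 40 (names are illustrative):
# preds = [predict_sasa_at(300, mbtr[:40], forecast_next, sasa_model) for mbtr in test_mbtr_trajectories]
# truth = [sasa[299] for sasa in test_sasa_trajectories]   # ground-truth SASA at the 300th timestep
# print(error_sasa(preds, truth))
```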

Table 1 The impact of different window size values, \(w_s\).

From Table 1, it can be observed that although the MAE of the time series model is smallest for the smallest window size, the best combined-inference score is achieved with a window size of 40. Hence, we use this value when comparing the outputs for the test set designs against the ground-truth values acquired by MD simulations; the results are presented in Table 2.

Table 2 Comparison between the predictions for the test set and the ground-truth values. Both the predictions and the actual values are for the 300th timestep of the test set designs.

It can be observed from Table 2 that the predictions are very close to the SASA values obtained by running MD simulations for the whole duration. The potential of the model is therefore considerable, especially given the computing and resource expenditure of acquiring these values through MD simulations for a large number of NP designs.

Explainable AI prospects

To establish the reliability of the results, we use SHapley Additive exPlanations (SHAP)49, applied to our model to identify the atomic interactions that most strongly affect the model’s output, i.e., the SASA value. The results of the proposed approach show a strong correlation between the MBTR descriptors and the corresponding SASA values, indicating that the interatomic distances influence how the NP evolves.

Since the same structure in different residues may have different effects on solubility, the whole drug-carrier system is not suitable for determining feature importance. For example, Panobinostat-based and Quinolinol-based NPs have opposing properties: Panobinostat is a hydrophilic (water-attracting) drug, whereas Quinolinol is hydrophobic (water-repelling), and the two have different impacts on the resulting SASA value7. As the drugs contain the same groups consisting of the same elements, the relation between the interatomic distances encoded by the MBTR and the SASA values is insufficient for explanations at the system level. For this reason, we generated MBTRs and built a separate model for each residue. In our approach, we focus on explanations for each residue to identify the pairs of elements within it that can result in a higher SASA value, as opposed to elements that are less significant. A minimal example of how such per-residue explanations can be produced is sketched below.
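The sketch below shows one way to obtain such per-residue SHAP explanations; `X_res`, `residue_model`, and `pair_labels` are placeholders for the per-residue data, the per-residue regression model, and the element-pair feature names, and KernelExplainer is one model-agnostic choice among several SHAP explainers.

```python
import shap

# X_res: MBTR vectors computed only from the atoms of one residue (e.g., Panobinostat).
# residue_model: regression model trained to map these per-residue MBTRs to SASA values.
# Both names are placeholders for objects prepared as described above.
background = shap.sample(X_res, 100)                       # background data for the explainer
explainer = shap.KernelExplainer(residue_model.predict, background)
shap_values = explainer.shap_values(X_res[:200])           # explain a subset to keep the cost manageable

# pair_labels: element-pair names ("H-H", "C-H", ...) matching the MBTR feature order (construction not shown)
shap.summary_plot(shap_values, X_res[:200], feature_names=pair_labels)
```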

For example, for the drug residue of Panobinostat-based NPs (Fig. 4), it can be observed that pairs of hydrogen atoms and pairs of carbon atoms are very important in terms of how steady the molecules on the surface are. The graph shows both positively and negatively contributing element pairs: positive interactions can lead to an increase in the SASA value, whereas negative interactions can lead to a decrease. Hydrogen atom pairs may have such a large impact because the more spread out the hydrogen atoms are, the more hydrogen bonds they can form with the solvent molecules. In contrast, as the carbons exist mainly in long chains, a relatively larger distance may indicate folding, which reduces solubility.

Figure 4

Feature importance graph for the Panobinostat residue from the 14 NP designs that carry the drug. The element pairs are ordered by decreasing importance, starting with the most influential one. Red and blue indicate positive and negative impacts on the resulting SASA value, respectively.

Discussion

Due to the wide range of biochemical and physicochemical properties of NPs and the expensive in vivo testing process, computational solutions (often MD simulations) are a more feasible and precise way to study NPs for anti-cancer treatment50. This work has been developed within an application scenario defined in the H2020 project EVO-NANO. The overall project scenario was to perform in silico NP design evaluations (MD simulations) before the synthesis of selected NPs, to evaluate the designs via in vitro experiments using vascular microchips, and finally to perform in vivo experiments using mouse cancer xenografts in which the biodistribution, efficacy, and toxicity of the designs can be validated. Although computational methods provide a faster way to transition from the laboratory to the clinic, their high computational resource and time requirements limit the experimental possibilities. The work presented in this paper focuses on the in silico step and proposes an approach to accelerate the evaluation of NP designs by predicting the stable state without executing complete MD simulations.

The most significant contribution of this work is that it addresses the limitations of MD simulations and provides a scalable solution. It offers the opportunity to eliminate NP designs that do not possess the expected properties from the large pool of candidates, so that a small number of drug-carrier systems with the largest efficacy values can be selected for further assessment. Completing an NP simulation over 300 ns takes several days on high-performance computing resources, while the approach discussed in this research takes less than ten minutes starting from the input batch. Hence, if \(w_s = 40\) is used, the time gain is approximately 7.5-fold (300 ns of simulation time versus 40 ns) for a simulation period of 300 ns. The computational cost is also reduced, since the trained model can predict the stable state of an NP design within a very short time, while the number of simulation steps remains adjustable. Real case studies on the use of automated learning-based prescreening processes have already been shown to be feasible and accurate51, and the target variable, SASA, has been shown to be effective for comparative analysis between different NP configurations7. In addition, this approach can be adapted to related applications where certain properties must be monitored, such as hydrophobic/hydrophilic properties52.

In drug discovery, explaining decisions made by ML models is crucial, especially given their impact. Some of the most important properties of such explanations are transparency (understanding the rationale behind the predictions), justification (the reasoning behind accepting the outcomes), and informativeness53. An explainable outcome not only establishes the credibility of the results by validating what is expected but can also be used in the reverse direction to find associations between the molecular structure and the physicochemical properties. We use local explainability techniques and demonstrate feature importance for a subset of the problem to achieve transparency. The effect of relative interatomic distances on the target property may not be directly applicable in the design process, but it can be used to establish new insights into the relationship between molecular structure and the target property. Moreover, this information can be expanded by breaking the problem down into finer pieces and observing the model’s behaviour from every perspective.

A limitation of this work is the limited availability of training data. Having more varied data covering different SASA ranges could enhance the model performance. Currently, the model has been trained with 107 different designs, and exposure to new designs could help it generalize further. Another limitation is the use of the MBTR descriptor, which encodes the whole NP structure in a simpler form at the cost of information loss. In the future, instead of working with a single descriptor, a combination of different descriptors could summarize the complex structure in a concise form without losing any properties of the NPs. Additionally, we have explored explainability only in a limited scope and demonstrated that the potential of such techniques in this area is very large. However, the relative distances between atoms are not configurable; hence, they cannot be translated directly into design decisions. As a future direction, explanations could be expanded so that every structure of an NP design can be thoroughly assessed and can directly influence design decisions. This could be achieved by extracting a hierarchy of properties, for instance, the ratio of drug to background molecules, the number of residues, and the sizes of the NP and the core, and evaluating the target characteristics against those.

Methods

In this section, we discuss the data used for this study, the transformation technique, and the proposed models in detail. This study did not require ethical approval.

Data description

The data used in this project are derived from MD simulations generated with the AMBER19 software54. In these simulations, the initial energy of the systems was minimized, and the temperature was then increased to 300 K. The MD simulations were run for one NP design at a time and stored in PDB format, a standard format for files containing atomic coordinates. A PDB file contains information about the elements in the system, atomic coordinates in (x, y, z) format, and residue names. Each simulation was run for a predefined duration of 300, 200, or 120 ns, and the PDB files were extracted at 1 ns intervals. Example simulation states at the beginning, middle, and end of a simulation are shown for a Panobinostat drug-based NP design in Fig. 5.

Figure 5

Simulation snapshots of a design containing the drug Panobinostat, generated using ChimeraX55. The purple portions depict the drug molecules around the surface.

A gold (Au) core is used in each of the systems, as it provides low toxicity and inertness and is easy to produce. The systems are designed with one of 9 different drug types, which can be classified as either hydrophobic or hydrophilic relative to each other. The NPs are functionalized with ligands such as polyethylene glycol, dimethylamino, and amino groups. The systems contain 6 or 7 unique elements, always including Au, S, H, C, O, and N, and optionally F or Cl. Apart from the drug molecules, other residues are used in combinations of 5–7 different types per NP. The drug-forming residues are described in Table 3.

Table 3 Description of the drug-forming residues.

A comprehensive discussion of how the NPs were designed for this experiment, along with how the simulations were conducted, is presented in the study by Kovacevic et al.50. The ground-truth total SASA value of each NP state represented by a PDB file was calculated with the Visual Molecular Dynamics (VMD) program56.

Transforming the data using descriptors

To make the data suitable for ML algorithms while keeping the representations computationally inexpensive and invariant to rotations, permutations, and translations, we use MBTR descriptors. The MBTR is a global descriptor that provides a unique representation for any single configuration57. Each system is divided into contributions from different element pairs and described using relative structural attributes. In this work, to extract a single value for a particular configuration of k atoms, we use an inverse-distance-based geometric function, \(g_2\), as in Eq. (2). The structure is then represented by constructing a distribution, \(P_2\), of the scalar values using kernel density estimation with a Gaussian kernel. The theoretical underpinnings of the descriptor are expressed in Eq. (3).

$$\begin{aligned} g_2(R_l, R_m) = \frac{1}{\vert R_l - R_m \vert } \end{aligned}$$
(2)
$$\begin{aligned} {P_2}^{l, m}(x) = \frac{1}{\sigma _2 \sqrt{2\pi }} \, e^{-\frac{(x-g_2(R_l, R_m))^2}{2\sigma _2^2}} \end{aligned}$$
(3)

where \(R_l\) and \(R_m\) refer to the Cartesian coordinates of atoms l and m, respectively, and \(g_2\) is the reciprocal of their Euclidean distance. As the distributions are calculated for a set of predefined values of x and a standard deviation \(\sigma _2\), each possible pair of the k species present yields multiple such values. These are combined into a single contribution per element pair by taking a weighted sum, as expressed in Eq. (4).

$$\begin{aligned} {MBTR}_2^{Z_1, Z_2}(x) = \sum _{l}^{\vert Z_1\vert } \sum _{m}^{\vert Z_2 \vert } w_2^{l, m} \times {P_2}^{l, m}(x) \end{aligned}$$
(4)

where \(Z_1\) and \(Z_2\) are the atomic numbers of the element pair, the sums run over all atoms l and m of those two elements, and \(w_2\) is the weighting function.
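To make the construction concrete, the following NumPy sketch evaluates the pairwise term of Eqs. (2)-(4) for a single element pair; the grid and the \(\sigma_2\) value are illustrative assumptions, and this is a didactic sketch rather than the DScribe implementation used in this work.

```python
import numpy as np

def mbtr_pair_term(coords_a, coords_b, grid, sigma=0.1, scale=0.75):
    """Eqs. (2)-(4) for one element pair: a weighted sum of Gaussians centred
    on the inverse distances of all atom pairs, evaluated on n_grid points.

    coords_a, coords_b : (n_a, 3) and (n_b, 3) Cartesian coordinates of the two species
    grid               : 1-D array of x values; sigma and scale are illustrative values
    """
    out = np.zeros_like(grid, dtype=float)
    for r_l in coords_a:
        for r_m in coords_b:
            d = np.linalg.norm(r_l - r_m)
            if d == 0.0:
                continue                                   # skip an atom paired with itself
            g2 = 1.0 / d                                   # Eq. (2): inverse distance
            w2 = np.exp(-scale * d)                        # exponential weighting, cf. w2 = e^{-sx}
            p2 = np.exp(-(grid - g2) ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
            out += w2 * p2                                 # Eq. (3)-(4): weighted Gaussian contribution
    return out

# Stacking this term for all 36 element pairs, with n_grid points each, gives the final MBTR vector.
```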

We use the DScribe implementation of the originally proposed method58. The exponential weighting function \((w_2 = e^{-sx})\) is used to keep the distributions tightly limited to atoms in the neighbourhood; for this, a cut-off threshold of \(1\times 10^{-2}\) and a scaling parameter of 0.75 are used8. A key parameter of the implementation, \(n_{\text {grid}}\), is the number of discretization points and, in turn, determines the total number of features in the resulting vectors through Eq. (5). To determine its optimal value, we observe the correlation between the resulting vectors, \(\text {MBTR}_{n_{\text {grid}}}\), for different values of \(n_{\text {grid}}\) and the corresponding SASA values according to Eq. (6). These correlation scores are presented in Table 4.

$$\begin{aligned} n_{\text {features}} = \frac{n_{\text {elements}} \times (n_{\text {elements}} + 1)}{2} \times n_{\text {grid}} \end{aligned}$$
(5)

where \(n_{\text {elements}}\) is the total number of elements encountered throughout the descriptor generation process; here, \(n_{\text {elements}} = 8\), giving \(8 \times 9 / 2 = 36\) element pairs.

$$\begin{aligned} C_2 = \sum _{j=1}^{n} \left| \sum _{i=1}^{k} Corr({\text {MBTR}_{n_{\text {grid}}}}^{\langle i\rangle }, \text {SASA}) \right| \end{aligned}$$
(6)

where k is the number of features and n is the number of samples used for the evaluation of \(C_2\).

Table 4 Correlation to SASA for different values of \(n_{\text {grid}}\).

From Table 4, we can observe that the correlation scores do not vary much for different values of \(n_{\text {grid}}\). However, as the lowest possible value of 2 achieves the highest score while producing the smallest representation, it is chosen for this work; with \(n_{\text {elements}} = 8\), Eq. (5) then gives \(36 \times 2 = 72\) features per snapshot.
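For reference, a sketch of generating one such 72-feature vector with DScribe and ASE is given below. The constructor keywords follow DScribe 1.x and may differ in newer releases, and the grid bounds and \(\sigma_2\) value are assumptions, as they are not specified in the text.

```python
from ase.io import read
from dscribe.descriptors import MBTR

atoms = read("np_design_snapshot.pdb")        # one PDB snapshot; the filename is illustrative

# Keyword layout follows DScribe 1.x; newer releases expose "geometry"/"grid"/"weighting"
# as top-level arguments instead of the k2 dictionary. Grid bounds and sigma are assumptions.
mbtr = MBTR(
    species=["Au", "S", "H", "C", "O", "N", "F", "Cl"],
    k2={
        "geometry": {"function": "inverse_distance"},
        "grid": {"min": 0.0, "max": 1.0, "n": 2, "sigma": 0.1},
        "weighting": {"function": "exp", "scale": 0.75, "threshold": 1e-2},
    },
    periodic=False,
    flatten=True,
)
vector = mbtr.create(atoms)                   # 8 * 9 / 2 element pairs * 2 grid points = 72 features
```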

Time series model

For the time series model, we use two approaches: the first is based on a transformer model, while the second approach implements an ensemble of XGBoost models.

Transformer model

A transformer is a model architecture that combines an encoder and a decoder. For this work, we use only the encoder part of the model, taking a batch of data with a fixed window size as input and outputting the multivariate MBTR vector corresponding to the next timestep. The architecture of the model is illustrated in Fig. 6a.

Figure 6

(a) Block diagram of the transformer model. Four different layers are used in the transformer model59. Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions. The dropout layer prevents overfitting, the normalization layer improves the training speed for various neural network models, and after normalization, the results are added to the input. The feedforward layer is a nonlinear mapping from an input pattern x to an output vector y. (b) Block diagram of the ensemble approach. The MBTR vector batches are split by feature, and all 72 subsets of data are used with an XGBoost regression model. The predictions from each model are then combined to produce the \(n_{\text {features}}\)-length output. (c) Block diagram of the SASA model. The 72 MBTR features at timestep k are passed to the i nodes of the input layer. The information in the input layer nodes is then passed to the hidden layers with p, n and m nodes, which are fully connected such that each node in a layer is connected to every node in the previous layer. The output is a single scalar value representing the SASA at timestep k.

In this work, a multi-head attention mechanism is used with 12 heads, an attention head size of 256, and a dropout probability of 0.25. The normalization layer uses \(\varepsilon = 1 \times 10^{-6}\) to normalize the input. The feedforward layer consists of a normalization layer, a 1-D convolutional layer, a dropout layer, and another 1-D convolutional layer. The normalization and dropout layers inside the feedforward layer use the same \(\varepsilon = 1 \times 10^{-6}\) and dropout probability of 0.25, respectively. The first convolutional layer uses ReLU activation with a kernel size of 1 and 4 filters, and the second convolutional layer uses a kernel size of 1 and a single filter.

The model is trained by taking a window, \(w_s\), with all \(n_{\text {features}}\) features from each design in the training set and predicting the \(n_{\text {features}}\)-length vector at the next timestep. For instance, providing the MBTRs of the first 40 timesteps as input produces the MBTR for the 41st timestep based on the pattern learned from the training dataset. Training this model takes 1378.5 s on a Tesla P100 PCIe 16 GB GPU with 28 2.4 GHz Intel Broadwell CPU cores and 230 GB of RAM. A minimal sketch of such an encoder-based forecaster is given below.
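The following is a minimal Keras sketch of such an encoder-based forecaster, assuming the hyperparameters listed above; the output head (flatten followed by a dense layer) is an assumption, as the text does not detail how the encoder output is mapped to the next MBTR vector.

```python
from tensorflow.keras import layers, Model

W_S, N_FEATURES = 40, 72                      # window size and MBTR length used in this work

def encoder_block(x, num_heads=12, head_size=256, dropout=0.25, ff_filters=4):
    # Multi-head self-attention over the timesteps in the window
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=head_size, dropout=dropout)(x, x)
    attn = layers.Dropout(dropout)(attn)
    attn = layers.LayerNormalization(epsilon=1e-6)(attn)
    res = attn + x                             # residual: add the normalized result to the input
    # Feedforward part: normalization, Conv1D (ReLU, 4 filters), dropout, Conv1D (1 filter)
    ff = layers.LayerNormalization(epsilon=1e-6)(res)
    ff = layers.Conv1D(filters=ff_filters, kernel_size=1, activation="relu")(ff)
    ff = layers.Dropout(dropout)(ff)
    return layers.Conv1D(filters=1, kernel_size=1)(ff)

inputs = layers.Input(shape=(W_S, N_FEATURES))
x = encoder_block(inputs)
# Output head (an assumption, not detailed in the text): flatten the encoder output and
# regress the 72-dimensional MBTR vector of the next timestep.
x = layers.Flatten()(x)
outputs = layers.Dense(N_FEATURES)(x)
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="mae")
```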

Ensemble model

The second approach is an ensemble of XGBoost regressors, with one model created for each feature. Each model is trained on a window, \(w_s\), of its feature to predict the value of that feature at the next timestep. The difference from the previous approach is that each model learns the pattern of a single feature of each design instead of taking all \(n_{\text {features}}\) features as input. As a result, it provides better predictability of the MBTR. Moreover, on the same hardware as the transformer model, this approach trains 20.73 times faster. The architecture of this model is shown in Fig. 6b.

For instance, given the MBTRs of the first 40 timesteps as input, the first model of the ensemble predicts only the value of the first feature. The procedure then iterates through the remaining features, and for each one, the corresponding model predicts its value at the next timestep. Finally, all predicted values are combined into one MBTR vector for the target timestep, as sketched below.
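A minimal sketch of this per-feature ensemble follows, assuming the training trajectories are available as arrays of shape (T, 72); the XGBoost hyperparameters shown are illustrative rather than those used in this work.

```python
import numpy as np
from xgboost import XGBRegressor

W_S, N_FEATURES = 40, 72

def make_windows(series, w_s):
    """Turn one feature's trajectory of length T into (T - w_s) sliding-window samples."""
    X = np.array([series[i:i + w_s] for i in range(len(series) - w_s)])
    y = np.asarray(series[w_s:])
    return X, y

# train_trajectories: list of (T, 72) MBTR trajectories, one per training design (assumed available)
models = []
for f in range(N_FEATURES):                                # one regressor per MBTR feature
    X_parts, y_parts = [], []
    for traj in train_trajectories:
        X, y = make_windows(traj[:, f], W_S)
        X_parts.append(X)
        y_parts.append(y)
    reg = XGBRegressor(n_estimators=200)                   # hyperparameters are illustrative
    reg.fit(np.vstack(X_parts), np.concatenate(y_parts))
    models.append(reg)

def forecast_next(window):
    """Predict the next 72-feature MBTR vector from a (w_s, 72) window, one feature at a time."""
    return np.array([models[f].predict(window[:, f].reshape(1, -1))[0] for f in range(N_FEATURES)])
```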

SASA model

A limitation of using the MBTR is that the encoded data cannot be reverted to atomic coordinates; therefore, the SASA cannot be calculated directly from the MBTR. However, as ML has the potential to identify hidden relationships, we use a feedforward neural network to predict the continuous SASA values from the encoded data. The MBTR input represents the state of the NP at one timestep. The training and testing datasets are divided in the same way as for the time series model.

The proposed network consists of dense layers: (i) an input layer with 256 neurons and ReLU activation that accepts the 72 MBTR features; (ii) 3 hidden layers, each with 256 neurons and ReLU activation; and (iii) an output layer with a single neuron and a linear activation function suitable for the regression task. For training, the model passes over the whole training set 500 times with a batch size of 32 and is optimized with the Adam algorithm at a learning rate of 0.0001. The resulting value is the predicted SASA. The performance of this regression model is evaluated with the MAE metric, which measures how close the predictions are to the expected values in either direction. The architecture of the model is shown in Fig. 6c, and a minimal sketch follows.
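The following is a minimal Keras sketch consistent with this description; the text does not state the framework used, and the layer count below follows the (i)-(iii) enumeration above.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Interpreting the description as one 256-unit layer on the 72 inputs, followed by
# three 256-unit hidden layers and a single linear output neuron.
sasa_model = keras.Sequential([
    keras.Input(shape=(72,)),                  # one MBTR vector per timestep
    layers.Dense(256, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="linear"),      # scalar SASA prediction
])
sasa_model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4), loss="mae")
# sasa_model.fit(X_train, y_train, epochs=500, batch_size=32)   # stacked MBTRs and SASA values
```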