Machine learning on properties of multiscale multisource hydroxyapatite nanoparticles datasets with different morphologies and sizes

Machine learning models for exploring structure-property relation for hydroxyapatite nanoparticles (HANPs) are still lacking. A multiscale multisource dataset is presented, including both experimental data (TEM/SEM, XRD/crystallinity, ROS, anti-tumor effects, and zeta potential) and computation results (containing 41,976 data samples with up to 9768 atoms) of nanoparticles with different sizes and morphologies at density functional theory (DFT), semi-empirical DFTB, and force field, respectively. Three geometric descriptors are set for the explainable machine learning methods to predict surface energies and surface stress of HANPs with satisfactory performance. To avoid the pre-determination of features, we also developed a predictive deep learning model within the framework of graph convolution neural network with good generalizability. Energies with DFT accuracy are achievable for large-sized nanoparticles from the learned correlations and scale functions for mapping different theoretical levels and particle sizes. The simulated XRD spectra and crystallinity values are in good agreement with experiments.


INTRODUCTION
Hydroxyapatite, Ca10(PO4)6(OH)2, is a biomaterial widely distributed in human bone and teeth. Recently, hydroxyapatite nanoparticles (HANPs) have attracted intensive interest due to their biocompatibility and biological activity [1][2][3]. HANPs can inhibit, to some extent, the proliferation of various tumor cells, such as hepatoma cells, osteosarcoma cells, lung cancer cells, and gastric cancer cells 4,5. More importantly, they are non-toxic to normal tissue cells. Some research groups and co-authors of the present work have fabricated various HANPs with different morphology, size, and crystallinity, for which the in vitro and in vivo anti-melanoma effect was systematically characterized 6,7. By collecting these experimental data, including TEM/SEM, XRD/crystallinity (Supplementary Table S1), reactive oxygen species (ROS, Supplementary Table S2), anti-tumor effects (Supplementary Data; source data are provided as a Source data file, as described in Data availability), and zeta potential (Supplementary Table S3), we found that the morphologies and physicochemical properties of HANPs play important roles in their anti-tumor ability (Supplementary Note 1 and Note 2). Treatment of melanoma-bearing nude mice with a high concentration of HANPs showed a strong inhibitory effect on tumor size and weight. Thus, HANPs hold promise as an excellent and safe biomaterial for antitumor treatment.
Understanding the structure-property relation is of great importance in the rational design of HANPs with the desired structure and properties. High-throughput computations are highly desirable for relating the different morphologies of HANPs to properties such as electrostatic potential surface (Supplementary Table S3), protein binding energy (Supplementary Table S4), surface energy, and surface stress (Supplementary Table S5). One of the important properties of nanoparticles is surface energy, which plays a crucial role in nucleation and growth [7][8][9][10][11][12][13][14][15]. The application of machine learning (ML) to rational nanomaterial design is attracting intense attention. In this work, we explore the correlation between the surface energy and geometric features, such as the ratio of length to diameter (L/D), the distance between the outer cap surface and the HANP center (d_cap), and the number of Ca²⁺ coordinations or contacts (N_Ca-contact), for a large number of HANPs with different sizes and shapes through high-throughput computations and machine learning techniques, as shown in Fig. 1.
Several popular explainable machine learning methods, such as LightGBM, XGBoost, Support Vector Machine, and GBRT, were applied to predict the surface energies and surface stress. We also developed a graph convolutional neural network model to learn features from the molecular graphs, to predict surface energy more accurately, and to provide a workflow and a learned scale function (λ) for obtaining surface energies of large-sized HANPs with DFT accuracy. To better correlate with experiment, image segmentation is applied to the experimental TEM images to obtain the geometry parameters of HANPs with different morphologies for 3D HANP model reconstruction. With these multi-source (experimental and computational) data and ML models, XRD patterns and crystallinities were simulated. The simulated results are in good agreement with the experimental ones. The survey of electrostatic potential surface (EPS) maps for different nanoparticles gives a hint for predicting the nucleation tendency and binding abilities of nanoparticles with proteins, shedding insights into the under-

Construction of datasets
The present HANPs dataset contains both experimental data and theoretical data, as shown in Supplementary Table S5. Experimental data were collected from the published literature, with the reference list given in Supplementary Note 1 (Supplementary Tables S1-S5, Supplementary Data). In contrast to other well-organized nanoparticle datasets of enzyme binding energy, cellular uptake potentials by HEK293/A549 cells, ROS, logP, zeta potentials in water/phosphate buffer, and catalytic properties of various nanoparticles [16][17][18], a HANPs dataset collecting various properties has rarely been reported. The reported experimental data of HANPs are rather complicated, with different crystal morphologies, surfaces, charges, and biological activity tested in different animals and cells under complex conditions 4,6,11. In addition, the dataset size is very limited. It is hence difficult to build data-hungry machine learning models that directly predict properties, as has been done for other nanoparticles using either a set of nanoparticle descriptors or a deep neural network (DNN) with many neurons per layer 18. Here we analyzed the experimental ROS and anti-tumor data using both a Bayesian network and the Apriori algorithm (Supplementary Note 2, Supplementary Fig. S1, Supplementary Tables S6 and S7). We found that the morphology of HANPs, sphere versus rod, is an important factor in both the ROS indicator and the in vitro inhibition rate.
Therefore, in the following subsections, we explore the correlation between the morphology and various properties such as surface energies and electrostatic potential. To this end, the computational data of HANPs with different morphologies and sizes are produced by concurrent high-throughput calculations at different theoretical levels: DFT (PBE/DND4.4), semi-empirical DFTB, and force field (CVFF).
As shown in Fig. 2, HANPs with different morphologies, such as sphere (S), rod (R), and needle (N), were generated automatically to satisfy the relationship between the crystal index and facets (Supplementary Note 3: Supplementary Fig. S2). To keep the generated HANPs neutral, some surface atoms were deleted randomly and automatically by using a homemade code. A random number was set for each fragment of the HANP structures, namely the calcium ion, the hydroxyl, and the phosphorus-oxygen tetrahedron. The excess parts were automatically deleted to meet the stoichiometry. Concurrent computation is applied to accelerate the dataset construction process ( Table S10); DFT versus DFTB; different force fields and surface models (Supplementary Tables S11 and S12). The details of the whole HANPs datasets are described in Supplementary Note 6, containing file format and dataset architecture (Supplementary Fig. S7), selected HANPs of the DFT dataset (Supplementary Table S13, Supplementary Fig. S8), some samples of the DFTB dataset (Supplementary Table S14

Explainable machine learning model
It is of great importance to introduce explainable machine learning methods to understand the relationship between HANP structure and properties. To describe the morphology differences of these HANPs, six feature descriptors were selected as the input data (Fig. 3a): the ratio of length to diameter (L/D), the distance between the HANP center and the (1 0 1) surface cap (d_cap), the number of atoms (N_atom), SASA, SASA per number of supercells (N_supercell), and the number of Ca²⁺ coordinations or contacts within a distance threshold (N_Ca-contact). In the DFT database, 4033 configurations of HAP nanoparticles were used for machine learning models including MLP, SVR, XGBoost, LightGBM, GBRT, and Random Forest (RF) methods [19][20][21][22][23][24]. The mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) were used to evaluate the model performance. All data samples were split in an 8:1:1 ratio into the training, validation, and test sets. The descriptors selection process together with the 'last elimination method' 25   Tables S15-S18, Supplementary Fig. S14), from which 6-feature and 3-feature schemes were used to give good predictions of surface energies. Feature importance was obtained by averaging multiple results from XGBoost, LightGBM, GBRT, and RF in the 6-feature and 3-feature scheme experiments, respectively (Supplementary Tables S19 and S20). The feature importance of each descriptor and the performance of the different machine learning methods are displayed in Fig. 3. The three most important features are L/D, d_cap, and N_Ca-contact, indicating that the shape characteristics are closely correlated with the surface stabilities. Since the surface energy is defined as the energy difference between the HANPs and the bulk divided by the To treat larger sized systems, we resorted to the semi-empirical DFTB calculations.
The DFTB dataset used for training contains 1850 optimized structures, which are classified into two groups: meta-stable nanoparticles with surface energies over 2 J/m² and low-energy nanoparticles with surface energies below 2 J/m². As shown in Fig. 4a, nanoparticles with higher surface energy have a low calcium coordination number (<4), and nanoparticles with low surface energy have a coordination number larger than 5.6. Specifically, needle-shaped nanoparticles (L/D > 5) with a Ca coordination number of 4-6 usually have low surface energy. With the target of synthesizing stable nanoparticles with relatively low surface energy, we set up a dataset with the criteria of calcium coordination number >6 and L/D > 5. Descriptor selection was performed systematically, as shown in Supplementary Table S11. The importance of SASA is indicated by its close relation with the number of surface Ca²⁺ ions, called N_surface_ca. An increase in SASA and N_surface_ca corresponds to a less stable surface with increased surface energy (Fig. 4b, Supplementary Fig. S15). DFTB: 0.64 J/m²) of (0 0 1) surface of HAP crystal. It will be shown that the (0 0 1) surface is crucial in HANPs not only because it has the lowest surface energy but also because its XRD pattern is especially important for the crystallinity measurement.
However, the configuration spaces obtained from DFT and DFTB calculations are not large enough to cover all kinds of shapes. Needle-shaped nanoparticles with L/D up to 20 can be formed only when the number of atoms increases up to 9000. For such large-sized systems, a force field method, CVFF, was employed to yield 5940 optimized structures. It should be mentioned that surface energies predicted from the DFT dataset cannot be directly compared with those from the CVFF dataset, since the force field energy is the strain energy relative to the 'natural' parameters rather than the electronic energy in DFT (Supplementary Fig. S16). As shown in Fig. 5, both the 6-feature and 3-feature machine learning schemes in LightGBM give satisfactory performance in predicting surface energies of HANPs with different shapes within the CVFF dataset. The performance of LightGBM on the training, validation, and test sets is shown in Supplementary Table S21.
Some experimental methods have been developed for measuring the surface energy, as summarized in Supplementary Note 5. Unfortunately, experimentally measured surface energies of HANPs have not been reported yet. The predicted surface energies of the nanoparticles await future experimental tests. Another experimental endpoint property is surface stress 26 (Supplementary Note 8). Unlike the surface energy, the surface stress is evaluated from the geometry distortion and compressibility; it is also predicted by machine learning using both the 6- and 3-feature schemes (Fig. 3e). As shown in Supplementary Table S22 and Supplementary Fig. S17, satisfactory performance is achieved for the 6-feature scheme (MAE: 1.16 N/m; R²: 0.90) and the 3-feature scheme (MAE: 1.47 N/m; R²: 0.84).

Multilevel attention graph convolutional neural network
Although the above-mentioned explainable ML methods performed well on all the computational datasets, a deep learning method with predictive power that does not require feature descriptors to be set up in advance is highly desirable in the rational design of materials. Here, we applied a multilevel attention graph convolutional neural network, called DeepMoleNet 27, to predict the energies of HANPs. The atomic type, atomic number, outer-shell valence, van der Waals radius of each element, and atom node degree are used as node inputs, and the bond type and Gaussian-expanded distance are used as edge inputs (Supplementary Note 9: Supplementary Table S23). HANP structures in both the AIMD and DFT datasets, 27,033 data points in total, were used in graph learning, of which 22,000 points were used as the training set, 2000 as the validation set, and the remaining nanoparticles as the test set. The learning results of DeepMoleNet on the DFT and DFTB datasets are shown in Fig. 6a. Both LightGBM and deep learning methods show predictive power (Supplementary Tables S24-S26, Supplementary Fig. S18). Fig. 6b also shows that the present graph convolutional neural network, trained on the small-sized systems of the DFT and AIMD data points, has good transferability to the larger-sized particles in the DFTB datasets. To show the energy correlation between the two different levels (and scales) more clearly, the comparison in Fig. 6b is divided into two regions (left: low surface energy; right: higher than 2 J/m² at the DFTB level). The transferability test, in which the model is trained on DFT data and used to predict on the DFTB dataset (denoted DFT→DFTB), showed that the surface energies predicted by deep learning at the DFT level correlate well with the DFTB computation results. Since the comparison is made between the DFTB surface energies and those transferred from DFT data, MAE is not available in Fig. 6b.
The scale function, λ, is further learned from Fig. 6b, which presents a relationship between the DFT and DFTB surface energies. As illustrated in Fig. 6c, the introduction of the learned scale function into the deep learning flowchart on multi-scale data could yield the surface energies of the medium or even large-sized HANPs with DFT accuracy.
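The functional form of λ is not given in this text. Purely as an illustration of how a learned mapping between two theoretical levels can be applied, the sketch below assumes that λ can be approximated by a linear map from DFTB to DFT surface energies, fitted by least squares on particles computed at both levels. The linear form, the function names, and the numbers are hypothetical and are not the authors' model.

```python
def fit_linear_scale(e_low, e_high):
    """Least-squares fit of e_high ~ slope * e_low + intercept.

    e_low:  surface energies at the cheaper level (e.g. DFTB), in J/m^2
    e_high: surface energies of the same particles at the reference level (e.g. DFT)
    ASSUMPTION: a linear form is used only for illustration.
    """
    n = len(e_low)
    mean_x = sum(e_low) / n
    mean_y = sum(e_high) / n
    sxx = sum((x - mean_x) ** 2 for x in e_low)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(e_low, e_high))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_scale(e_low, slope, intercept):
    """Map cheaper-level energies of larger particles onto the reference scale."""
    return [slope * x + intercept for x in e_low]


# Hypothetical paired energies for small particles computed at both levels.
slope, intercept = fit_linear_scale([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
scaled = apply_scale([4.0], slope, intercept)
```

In the workflow described above, the mapping is learned within the deep learning flowchart rather than by a closed-form fit, so this sketch only conveys the idea of transferring energies between levels.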

Interface with experiments: XRD, surface potential, and protein-binding ability
XRD simulations are useful to bridge the microscopic structure and the synthesized materials in experiments [28][29][30]. Recently, Parameter Quantification Network 31 (PQ-Net) was developed to predict scale factor, lattice parameter, and crystallite size of powder X-ray diffraction patterns from multi-phase systems of Ni-Pd/CeO2-ZrO2/Al2O3 catalytic material systems consisting of about 20,000 diffraction patterns. To apply the aforementioned ML models in real HANPs obtained from experimental synthesis, we realized the three-dimensional (3D) geometry reconstruction from two-dimensional (2D) TEM images by integrating the image segmentation techniques for extracting the objects in TEM images and machine learning algorithms (such as XGBoost and LightGBM) for analysis of these extracted objects (HANPs). After that, these nanoparticle models, characterized by geometry parameters (L, D, d_cap, etc.) coming from TEM images (Fig. 7), were built upon satisfying the relationship between the faceted surfaces (Fig. 2).
We predicted XRD patterns for the selected HANPs, which are in good agreement with the experimental ones (Fig. 7). From the XRD spectra, we further calculated the crystallinity of each nanoparticle. According to the experiments, HA-A and HA-E had far lower crystallinity than the other three HANPs, due to the absence of a hydrothermal or calcination process 6. Our simulated XRD and crystallinity results reflect this critical difference in the fabrication process. Furthermore, the crystallinity of HANPs was also an important indicator of induced apoptosis of the tumor cells. Among the HANPs, HA-A had the lowest crystallinity but the highest inhibitory effect on the viability of melanoma cells.
Variations in the molecular shape and surface charge of HANPs lead to different interactions with the protein target 32 and different self-aggregation. These different morphologies have distinct electrostatic potential surfaces (EPS), as shown in Fig. 8a and Supplementary Table S3. For the spherical nanoparticles, the weak positive and strong negative charges are distributed alternately, for both the charged (HA-S*) and neutral (HA-S) systems. In addition, the negatively charged spherical nanoparticles repel each other electrostatically, which makes aggregation and nucleation relatively more difficult as the concentration increases.
The HANPs can enter cells through endocytosis, induce apoptosis, and inhibit tumor metastasis. In vivo, the tumor cell membrane has locally positively charged regions that can permit the entrance of electronegatively charged particles. It was found that negatively charged iron oxide nanoparticles were preferentially endocytosed by cervical carcinoma cells, as compared to the positively charged ones 33. Based on our simulation here, the small-sized HA-S with the strongly negatively charged surface would be easily taken up by tumor cells, which is in good accordance with our experimental findings (Supplementary Table S3). Surface charge also plays a crucial role in the interaction with charged amino acids and proteins. It has been demonstrated that charged amino acids can regulate the nucleation of biomimetic HANPs (Supplementary Note 10).
As shown in Supplementary Fig. S19, EPS could be correlated with various properties such as surface energy, charge separation, and zeta potential. The surface potential in Supplementary Table  S3, which was estimated from surface charge distributions in the presence of water solvent molecules, is related to but not identical with the zeta potential. The former is at the surface while the latter is located at the solid-liquid interface in close proximity to the solid surface. The geometry descriptors such as the length (L), shape (L/D), Ca coordination number, and (0 0 1) facet percentage  Table S3, Supplementary Figs. S19 and S20).
It is interesting to find in Fig. 8b that the appearance of the faceted (0 0 1) surfaces in nanoparticles leads to the charge separation on the surface. As mentioned above, the nanoparticle with the faceted surfaces, especially for the needle particle HA-N1, whose surface presents a stripe-like pattern with alternating positive and negative electrostatic potentials, has a low surface energy, even close to that of the crystal surface. To further display the relationship between the surface charge separation and the occurrence of faceted (0 0 1) surface and XRD, some nanoparticles with the evident faceted (0 0 1) surface have been illustrated in Supplementary Fig. S21. With the increase in the particle size, the crystallinity is increased, especially for the rod-shaped nanoparticles. The surface energies of all the selected (0 0 1) facets terminated nanoparticles are very small and the charge separation takes place on the surface of rod nanoparticles.

DISCUSSION
In this study, both explainable machine learning methods and graph convolutional neural networks have been applied to predict the surface energies of various HANPs with different morphologies and sizes at different theoretical levels. Explainable machine learning methods showed predictive power for the experimentally accessible surface energy and surface stress with just three features (L/D, d_cap, N_Ca-contact), providing important insights to experimentalists quickly. However, when the learning problems become more complex, for example, when the surfaces become more irregular with more surface atoms, it is not easy to find a uniform explainable model that gives acceptable prediction accuracy for all data. The prediction was therefore made by dividing the data into two groups, meta-stable HANPs with surface energies larger than 2 J/m² and low-energy nanoparticles with surface energies below 2 J/m². Developing proper descriptors for more complex nanoparticle structures with the aid of both scientific intuition and massive computations is still under way in our laboratory.
On the contrary, the graph convolutional neural network gives high predictive power in a "black-box" manner, no matter how the surfaces change. In this work, a multilevel attention graph convolutional neural network for HANPs was designed and used for computational graph modeling of large nanomaterials, which characterizes the structural diversity of nanomaterials and enables virtual screening of nanoparticles. Good predictability was shown for the surface energies of HANPs with different morphologies and sizes with the introduction of the learned scale function. However, it cannot provide clear physical insights into the underlying relationship between the features and the surface property.
To sum up, the current modeling strategy for HANPs is a universal tool for the rational material design of various nanoclusters without any laborious and complex feature engineering. The present ML flowchart and datasets will be further extended to the development of nanomaterials with desirable properties by fast virtual screening.

Generation of nanoparticles with different morphologies and sizes
The calculations were carried out on the HAP crystal structure and the HAP habit with the exposed surfaces (Supplementary Fig. S1), at three different theoretical levels: DFT (PBE/DND4.4), DFTB, and the consistent valence force field (CVFF).
According to the Wulff construction rule, the different morphologies of the nanoparticles are generated automatically by using the homemade codes in order to satisfy Eq. (1) and charge neutralization. It has been stated that a crystal that has the lowest surface energy is constructed by satisfying the following relation, where (h k l) is the Miller index of the surface plane in the crystal. The parameters a and c are the lattice parameters of HAP. The parameter d_hkl is the distance between the model center and the (h k l) surface. In a similar way, we define the parameter d_cap, which is the distance between the model center and the outer faceted surface, for the HANPs. The equation has been used for hexagonal crystal system materials and metal nanoparticles 34,35. The surface energy, E^layer_surface, of a slab model cut from a certain crystal surface is calculated according to Eq. (2).
where E_slab is the energy of the optimized slab model with N layers, E_bulk is the energy of the HAP crystal, and A is the area of the exposed crystal surface. This equation was previously used for HAP and gold surfaces [36][37][38][39]. The slab models were cleaved from the HAP crystal with thicknesses of 1 layer, 2 layers, and 3 layers, respectively. The slab vacuum is set to 15 Å. The simulation models are displayed in Supplementary Fig. S5. All the theoretical simulations showed the same trend, in that the c (0 0 1) surface has the lowest surface energy. The m (1 0 0) and x (1 0 1) surfaces exhibit almost the same surface energy. At the level of PBE/DNP4.4 with Grimme correction, x (1 0 1) shows a relatively higher surface energy than m (1 0 0), as shown in Supplementary Table S8. Our results are also consistent with other works (Supplementary Table S9). Unlike the PBC model, the surface energies of HANPs are calculated according to Eq. (3).
Since the surfaces are irregular, we use the solvent-accessible surface area (SASA) of HANPs surface to describe the surface area A. The SASA is a very important feature descriptor used in the machine learning of surface energies.
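Eqs. (2) and (3) are not reproduced in this text. As a minimal sketch, assuming the conventional forms (the slab energy minus N bulk layer energies, divided by twice the exposed area; the nanoparticle energy minus the energy of the same number of bulk formula units, divided by the SASA), the two surface energies could be evaluated as follows. The function names, the per-layer and per-formula-unit normalization of E_bulk, the factor of 2 for the two slab faces, and the eV and Å² input units are assumptions made for illustration, not taken from the paper.

```python
# Unit conversion: 1 eV/Angstrom^2 expressed in J/m^2.
EV_PER_A2_TO_J_PER_M2 = 16.02176634


def slab_surface_energy(e_slab, e_bulk_per_layer, n_layers, area):
    """Surface energy of a periodic slab in J/m^2.

    ASSUMPTION: conventional form (E_slab - N * E_bulk) / (2 * A), where the
    factor 2 accounts for the two exposed faces of the slab.
    Energies in eV, area in Angstrom^2.
    """
    excess = e_slab - n_layers * e_bulk_per_layer
    return excess / (2.0 * area) * EV_PER_A2_TO_J_PER_M2


def nanoparticle_surface_energy(e_particle, e_bulk_per_unit, n_units, sasa):
    """Surface energy of a nanoparticle in J/m^2.

    ASSUMPTION: energy difference between the nanoparticle and the same
    amount of bulk, divided by the solvent-accessible surface area (SASA).
    Energies in eV, SASA in Angstrom^2.
    """
    excess = e_particle - n_units * e_bulk_per_unit
    return excess / sasa * EV_PER_A2_TO_J_PER_M2


# Hypothetical numbers, chosen only to exercise the functions.
gamma_slab = slab_surface_energy(-98.0, -50.0, 2, 10.0)
gamma_np = nanoparticle_surface_energy(-480.0, -50.0, 10, 200.0)
```

Using the SASA in the denominator is what allows the same definition to be applied to irregular nanoparticle surfaces, for which no single crystallographic area is defined.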
The surface stress of a nanoparticle was evaluated from Eq. (4), where a is the lattice constant of bulk HA, which is taken as the Ca-P distance in this work, Δa/a is the relative lattice constant change in HANPs, r is the mean particle diameter, and κ is the compressibility. In other words, the averaged Ca-P distance is used to measure the relative lattice constant. The mean value of the diameter, length, and d_cap is used as the surface radius.
[Figure caption fragment] Simulations of XRD and crystallinity are obtained from these 3D models. The surface energies are also predicted by LightGBM trained on the DFTB dataset.

Explainable machine learning models
It is of great importance to introduce explainable machine learning methods to understand the physical processes underlying HANP properties. To describe the morphology differences of these HANPs, six feature descriptors were initially selected as the input data, as shown in Fig. 3: the ratio of length to diameter (L/D), the distance between the HANP center and the (1 0 1) surface cap (d_cap), the number of atoms (N_atom), SASA, SASA per number of supercells (N_supercell), and the number of Ca²⁺ coordinations or contacts within a distance threshold (N_Ca-contact). Furthermore, the adoption of only three geometric descriptors, i.e., L/D, d_cap, and N_Ca-contact, also works well for a large number of HANPs with different sizes and shapes. Explainable machine learning models including XGBoost, LightGBM, Random Forest, and GBRT are used in this work. The performance of LightGBM on the training, validation, and test sets is recorded in Supplementary Table S21.
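The 8:1:1 split and the MAE, RMSE, and R² metrics used to evaluate these models can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the shuffling seed, and the rounding of the split sizes are assumptions, not the authors' implementation.

```python
import math
import random


def split_811(samples, seed=0):
    """Shuffle and split samples into training/validation/test sets (8:1:1)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(0.8 * len(shuffled))
    n_val = int(0.1 * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired reference and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Split the indices of a dataset with 4033 configurations (the DFT set size).
train_idx, val_idx, test_idx = split_811(range(4033))
```

The same metric function can be applied to the predictions of any of the regressors named above, which keeps the comparison between models on an equal footing.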

Multilevel attention graph convolutional neural network
Applying feature engineering and machine learning with a fixed feature length to predict the target surface energy may be problematic. Computational nanomaterial modeling and virtual nanomaterial screening based on feature engineering are not always successful for nanomaterials 17. Nanoparticles with irregular shapes and sizes pose a challenge for finding suitable feature descriptors. The traditional pattern-recognition or machine-learning approach requires complex feature engineering that draws on considerable chemical expertise. Such feature engineering is important but labor-intensive, and it highlights a weakness of current learning algorithms: their inability to extract and organize the discriminative information from the data [40][41][42]. Graph neural networks have been widely used in many chemical applications [43][44][45][46][47][48][49][50][51][52][53][54][55][56][57], and various neural networks have been developed. Among these methods, DeepMoleNet is a graph convolutional neural network (GCN) that combines multi-level attention with chemical descriptors through multi-task learning and achieves state-of-the-art performance on several datasets 27. Applying GCNs to materials property prediction is attractive because chemical structures can be represented as graphs with atoms as nodes and bonds as edges. DeepMoleNet belongs to the MPNN architecture. This means that the computational complexity increases with the number of edges, which limits its application to larger graphs, because every graph is treated as a fully connected graph. In more detail, the DeepMoleNet calculation depends on the edge network A(e_vw)h_w, so the time complexity per layer is of order O(mc²), with width c and number of edges m. The number of edges grows as n² with the number of nodes n, since each input is processed into the fully connected state. It is therefore rather difficult to treat large-sized molecules and nanoparticles.
One way to speed up the calculation is to cut down the edge connections. According to our chemical intuition, it is reasonable to replace the full connectivity of a center atom v with a cutoff radius and to consider only the nearest neighbors within it, which influence the center atom the most. The fully connected MPNN achieves the best representation performance, so atoms outside the radius should, in principle, also be connected. However, to reduce the cost, atoms outside the radius are treated as the environment, and only a subset of these nodes, v′, is needed to simulate the real situation. If one cluster has 1000 atoms, the atoms within the radius are <6. The number of edges thus decreases sharply, which saves computation cost. The strategy can be formalized in the following way.
Within the cutoff, the atom-atom connections are treated in the same way as in the GCNN. Outside the cutoff, we randomly sample a small part of the atoms to represent the outer-shell chemical environment. In this way, for a large molecular graph, each atom correlates strongly only with a small number of atoms (<10 atoms) within the cutoff radius. The calculation cost drops because the messages received from the neighboring part are sparse. The parameters used in this work are shown in Supplementary Table S23. The performance of DeepMoleNet on the training, validation, and test sets is provided in Supplementary Table S26.
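The edge-reduction strategy described above (all neighbors inside a cutoff radius plus a small random sample of atoms outside it) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_sparse_edges`, the parameters `cutoff` and `n_env`, and the uniform random sampling of environment atoms are hypothetical choices, not the DeepMoleNet implementation.

```python
import math
import random


def build_sparse_edges(coords, cutoff, n_env, seed=0):
    """Build directed edges (v, w) for a sparsified atom graph.

    For each center atom v, connect every atom within `cutoff` (same distance
    units as coords) and at most `n_env` randomly sampled atoms beyond it,
    which stand in for the outer chemical environment.
    ASSUMPTION: uniform sampling of environment atoms.
    """
    rng = random.Random(seed)
    edges = []
    for v, pos_v in enumerate(coords):
        inner, outer = [], []
        for w, pos_w in enumerate(coords):
            if w == v:
                continue  # no self-loops
            if math.dist(pos_v, pos_w) <= cutoff:
                inner.append(w)
            else:
                outer.append(w)
        env = rng.sample(outer, min(n_env, len(outer)))
        edges.extend((v, w) for w in inner + env)
    return edges


# Toy example: 10 atoms on a line with 1.0 spacing (hypothetical geometry).
chain = [(float(i), 0.0, 0.0) for i in range(10)]
sparse = build_sparse_edges(chain, cutoff=1.5, n_env=2)
n_full = len(chain) * (len(chain) - 1)  # directed edges if fully connected
```

With a fixed cutoff and a fixed number of sampled environment atoms, the number of edges per atom is bounded, so the total edge count grows linearly with the number of atoms instead of quadratically. The double loop used here to find neighbors is itself quadratic; a cell list or neighbor list would be the usual choice for large particles.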

Simulation of XRD and crystallinity
For the HANPs, spherical, rod-shaped, plate-like, and needle-like models with different morphologies were constructed by image segmentation of TEM pictures (Supplementary Note 10). The goal of image segmentation is to partition an image into regions, each of which is homogeneous or similar in terms of some pattern. It is thus useful in differentiating the foreground from the background. During the segmentation procedure, the code assigns one label to each pixel of an image, so that pixels sharing similar characteristics are assigned the same label. The procedure is based on the assumption that the objects and the background in the image have a bimodal distribution. Threshold segmentation is one of the most popular methods for image segmentation. It transforms the input image into a binary image, e.g., by grouping the pixels with intensities higher than the threshold value into one class and the remaining pixels into another class 58. It is then much easier to judge the shapes of nanoparticles of different morphologies. Here, we employ the Otsu method 59 to determine the value of the global threshold and then perform the image segmentation according to this value. Afterward, the diameter and length parameters are obtained as input for 3D structure generation, XRD simulation, and machine learning of surface energies. From the XRD spectra, we calculated the crystallinity of each nanoparticle according to the equation shown below, where β_002 is the half width of the (0 0 2) peak. It should be mentioned that the (0 0 2) index also corresponds to the (0 0 1) surface property.
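The global thresholding step can be illustrated with a compact, dependency-free version of Otsu's method, which selects the threshold that maximizes the between-class variance of the intensity histogram. This is a sketch under stated assumptions: the image is given as a flat list of integer gray levels in the range 0 to 255, and the function names are hypothetical; it is not the authors' segmentation code.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with intensity <= t form one class, pixels > t the other.
    ASSUMPTION: `pixels` is a flat list of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with intensity <= t
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Label pixels above the threshold as 1 (one class) and the rest as 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Toy bimodal "image": dark background and bright particles (hypothetical values).
image = [10] * 50 + [200] * 50
mask = binarize(image, otsu_threshold(image))
```

On a real TEM image, the binary mask would then be passed to a connected-component or contour analysis to extract the length and diameter of each particle. Whether particles appear brighter or darker than the background depends on the imaging mode, so the labels may need to be inverted.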

Electrostatic potential surface and surface potential prediction
The morphology of a nanoparticle can determine its binding specificity through electrostatic interactions. To simulate the surface potential in solvents, we embedded the HANPs into explicit solvent molecules. Here, the solvents are water molecules. We then calculated the surface potential using the partial charges of atoms within 5 Å of the solvent.

DATA AVAILABILITY
Source data are provided with this paper and also at the website (http://www.webace-i3c.com/ATTRMaterialDatabase/home/home). Source data file for anti-tumor effects is provided as Supplementary Data. Computational data are also available upon request.