Convolutional neural network-assisted recognition of nanoscale L12 ordered structures in face-centred cubic alloys

Nanoscale L12-type ordered structures are widely used in face-centred cubic (FCC) alloys to exploit their hardening capacity and thereby improve mechanical properties. These fine-scale particles are typically fully coherent with the matrix and share its atomic configuration apart from chemical species, which makes them challenging to characterize. Spatial distribution maps (SDMs) are used to probe local order by interrogating the three-dimensional (3D) distribution of atoms within reconstructed atom probe tomography (APT) data. However, it is almost impossible to manually analyse the complete point cloud ($>10$ million) in search of the partial crystallographic information retained within the data. Here, we propose an intelligent L12-ordered structure recognition method based on convolutional neural networks (CNNs). The SDMs of a simulated L12-ordered structure and the FCC matrix were first generated. These simulated images, combined with a small amount of experimental data, were used to train a CNN-based L12-ordered structure recognition model. Finally, the approach was successfully applied to reveal the 3D distribution of L12-type $\delta^\prime$-Al3(LiMg) nanoparticles with an average radius of 2.54 nm in an FCC Al-Li-Mg system. The minimum radius of a detectable nanodomain is as small as 5 Å. The proposed CNN-APT method could be extended to recognize other nanoscale ordered structures and even more challenging short-range ordering phenomena in the near future.


Introduction
of images in this study) is a computationally intensive task and almost impossible if attempted manually. Thus, an automatic method for identifying ordered structures is required. The previous solution 17,18 exploits the difference in composition between the ordered structures and the matrix, analysing the SDMs of subsets separated by isocomposition surfaces at a chosen composition, such as 8 at.% Li in an Al-Li-Mg system 17 . However, this filtering cannot ensure that matrix atoms are excluded, i.e., the choice of 8 at.% Li as the dividing line is somewhat arbitrary 17 . Moreover, the approach breaks down when the ordered structures and the matrix differ little in composition, as for the L10 ordered structure in an Au-43Cu-7Ag (at.%) alloy 19 , or when characterizing short-range ordered structures 13,20,21 .
Recently, machine learning has been applied to APT data to automate the identification of a specimen's crystallographic orientation or improve microstructural feature extraction 22,23 .
Machine learning algorithms have the potential to unveil ordered structures by learning characteristic patterns in experimentally obtained SDMs. As a representative image recognition technique, CNNs have been used to automate the identification of microstructural and crystallographic features in micrographs 4,24,25 . A remarkable advantage of CNNs is the automatic extraction of features with minimal human intervention 26,27 . The essence of image recognition via CNNs is to extract features at different levels, from low-level edge and colour features to more abstract features, through a series of convolutional and pooling layers. Different crystal structures generate different interplanar spacings in zx-SDM patterns, with different relative colour scaling. These edge and colour features in SDMs are the key pieces of information for determining the structure type. Another advantage of CNNs for computer vision is their translation invariance, although this property was not exploited in our study because of the pre-defined image generation procedure.
In this work, a CNN-based strategy is proposed to automatically recognize nanoscale L12-type ordered structures in FCC-based alloys using APT data with an ultra-high recognition ability.
Firstly, a crystal structure library covering a wide range of possible configurations was built and used to produce many simulated APT datasets, all based on either the L12 or the FCC crystal structure. From these simulated structures, the corresponding zx-SDMs along a specific crystallographic direction were generated. The obtained SDMs (used as inputs), combined with their corresponding crystal structures (used as labels), were divided into training, validation, and test datasets, which were then used to train a CNN and obtain an L12 ordered structure recognition model. A second training procedure was also performed after enriching these synthetic datasets with a few experimentally obtained SDMs, to enhance the model performance and speed up the training. Finally, the experimentally obtained SDMs from an Al-Li-Mg alloy were input into this recognition model to identify the 3D distribution of the L12-type $\delta^\prime$-Al3(LiMg) particles in the FCC matrix. This result was then compared with the previous isocomposition approach to highlight its advantages.

APT data analysis
The typical 3D atom probe tomography reconstruction of the Al-6.79Li-5.18Mg (at.%) alloy is shown in Figure 1. A parallel Python-based program was written to quickly scan the dataset shown in Figure 1 Figure 1 (e). This difference allows one to recognize the structural difference and thus highlight the sites of L12 particles.

Simulation of zx-SDMs
For the machine learning part, the first step was to build the synthetic dataset used to train the CNN parameters. Here, a crystal structure library was built to include various possible configurations based on the FCC and L12 crystal structures. Several parameters were considered in constructing the library: lattice parameters, crystallographic rotation, noise simulating limited spatial resolution, loss of atoms simulating imperfect detection efficiency, and data size (number of atoms). Then, the simulated zx-SDMs along [011] were generated from these simulated configurations. Note that [011] is equivalent to the $[\bar{1}01]$ pole shown in Figure 1 (c).
The procedure used to generate the simulated zx-SDMs is shown in Figure 2. Firstly, FCC and L12 crystal structures with a volume of 25 nm$^3$ were built using the posgen program 28 , as shown in Figure 2 (a). The lattice constant of the FCC-Al structure was set to 0.405 nm 29,30 . The same lattice constant was used for the L12-type Al3(MgLi) structure because it is fully coherent with the matrix 17 .
Note that Mg and Li atoms were not labelled separately in this model because only the zx-SDMs of Al-Al pairs were used for structure recognition in this paper. Secondly, an Euler transformation was applied to change the projection pole from [001] to [011], as shown in Figure 2 (b). Thirdly, Gaussian noise of a chosen level was added to shift the atoms in the x, y, and z reconstruction directions to model finite spatial resolution, as shown in Figure 2 (c). Note that the standard deviation (σ) of the Gaussian noise in the z-direction is smaller than those in the x and y directions, simulating the higher resolution in the depth direction. Fourthly, a certain fraction of atoms was randomly removed to simulate imperfect detection efficiency (see Figure 2 (d)). Finally, the corresponding zx-SDMs of Al-Al pairs were generated for the two crystal structures with different parameters to build the zx-SDMs dictionary, as shown in Figure 2 (e). Table 1 summarizes the parameters used for building the crystal structure library and generating the corresponding zx-SDMs dictionary. Two kinds of crystal structures were included with different noise levels and detection efficiencies. The upper and lower bounds of the Gaussian noise levels were chosen based on the similarity between the simulated and measured zx-SDMs. The detection efficiency was adjusted to vary the number of atoms, and was therefore not fixed at the experimental value. The zx-SDMs dictionary was augmented with additional zx-SDMs rotated by random angles, to simulate the small-angle pattern rotations observed in the experimental zx-SDMs. In total, 18416 simulated images were included in this dictionary, divided almost equally between the two classes. A noise estimation 31 was made on the simulated and experimental images, as shown in Supplementary Figure 1 (a) and (b), respectively.
The noise levels of the simulated data cover those of the experimental zx-SDMs well. Note that adding different levels of noise to the input data of the CNN can help with out-of-distribution generalization and transfer to real data 32 .
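The simulation steps described above (build the crystal, rotate the pole, add anisotropic Gaussian noise, remove atoms, histogram the Al-Al pair offsets) can be illustrated with a minimal NumPy sketch. This is not the posgen-based pipeline used in this work: the function names, the noise levels (`sigma_xy`, `sigma_z`), the SDM half-width, and the small crystal size are assumptions chosen only to keep the example fast, and only the lattice constant (0.405 nm), the 150×150 image size, and the 55% detection efficiency are taken from the text.

```python
import numpy as np

A = 0.405  # lattice constant in nm (value quoted in the text)

# Rotation about x by 45 degrees: maps the [011] direction onto the z (depth) axis.
_c = np.cos(np.pi / 4)
RX45 = np.array([[1.0, 0.0, 0.0], [0.0, _c, -_c], [0.0, _c, _c]])


def build_al_sublattice(structure, n_cells):
    """Al positions (nm) for an FCC or L12 (Al3X) crystal of n_cells^3 unit cells."""
    corner = np.array([[0.0, 0.0, 0.0]])
    faces = np.array([[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]])
    # In L12 Al3X the corner site holds the solute, so only the face sites are Al.
    basis = faces if structure == "L12" else np.vstack([corner, faces])
    idx = np.arange(n_cells)
    cells = np.stack(np.meshgrid(idx, idx, idx, indexing="ij"), axis=-1).reshape(-1, 3)
    return A * (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3)


def zx_sdm(pos, half_width=1.0, bins=150):
    """2D histogram of pair offsets: rows are delta-z, columns are delta-x (nm)."""
    d = pos[:, None, :] - pos[None, :, :]
    d = d[~np.eye(len(pos), dtype=bool)]  # drop self-pairs
    edges = np.linspace(-half_width, half_width, bins + 1)
    hist, _, _ = np.histogram2d(d[:, 2], d[:, 0], bins=[edges, edges])
    return hist


def simulate_zx_sdm(structure, n_cells=5, sigma_xy=0.05, sigma_z=0.02,
                    efficiency=0.55, seed=0):
    """One simulated Al-Al zx-SDM; all default parameter values are illustrative."""
    rng = np.random.default_rng(seed)
    pos = build_al_sublattice(structure, n_cells) @ RX45.T  # [011] pole along z
    # Smaller sigma along z mimics the higher depth resolution.
    pos = pos + rng.normal(0.0, [sigma_xy, sigma_xy, sigma_z], size=pos.shape)
    keep = rng.random(len(pos)) < efficiency  # imperfect detection efficiency
    return zx_sdm(pos[keep])


fcc_image = simulate_zx_sdm("FCC")
l12_image = simulate_zx_sdm("L12")
```

The all-pairs offset array grows quadratically with the number of atoms, so a sketch like this is only practical for small simulated volumes; a production code would restrict pairs to a neighbour search radius.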

Network configuration
The simulated zx-SDMs were split into 90% for training and validation and 10% for testing. Five-fold cross-validation was used to train the model. The input images were 150×150 pixel greyscale images with a single channel, with pixel values between 0 (black) and 255 (white).
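As an illustration of this data partitioning, the sketch below holds out 10% of the 18416 simulated images for testing and divides the remainder into five cross-validation folds. The function name `split_indices`, the random shuffling, and the seed are assumptions; the text does not state how the split was actually drawn.

```python
import numpy as np


def split_indices(n_samples, test_fraction=0.10, n_folds=5, seed=0):
    """Return held-out test indices and a list of (train, validation) index pairs."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    test_idx, dev_idx = order[:n_test], order[n_test:]
    folds = np.array_split(dev_idx, n_folds)
    cv = []
    for k in range(n_folds):
        # Fold k is used for validation, the other folds for training.
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        cv.append((train_idx, folds[k]))
    return test_idx, cv


test_idx, cv_folds = split_indices(18416)
```

With these assumptions, the test set holds 1842 images and each fold pair covers the remaining 16574 images.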
The adopted CNN is shown in Figure 3 (a). It consists of a six-layer structure with four convolutional (plus max pooling) layers and two fully connected layers (including the final output layer).
The detailed architecture of each layer is shown in Figure 3. Two cases were considered: (1) the training and validation datasets consist only of simulated data, while the test dataset consists of simulated and experimental data; (2) the training, validation, and test datasets all consist of simulated and experimental data. Here, 7 epochs were used to find the minimum cross-entropy loss, and the entire training process took only approximately 14 minutes on an Intel Core i7-9700 CPU (3.00 GHz). To solve this, in case 2, only 12 experimental zx-SDMs corresponding to the FCC structure were augmented into 84 samples using the same method as for the simulated data and added to the CNN training data.
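The augmentation of 12 experimental zx-SDMs into 84 samples can be sketched as a seven-fold expansion by small random rotations. The split into one original plus six rotated copies per image, the ±5° angle range, and the nearest-neighbour resampling are all assumptions made for illustration; the text only states that rotated copies were generated and gives the final count.

```python
import numpy as np


def rotate_image(img, angle_deg, fill=0):
    """Rotate a 2D image about its centre using nearest-neighbour sampling."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    yy, xx = np.indices((h, w))
    # Inverse mapping: for every output pixel, locate the source pixel.
    ys = cy + (yy - cy) * np.cos(theta) - (xx - cx) * np.sin(theta)
    xs = cx + (yy - cy) * np.sin(theta) + (xx - cx) * np.cos(theta)
    yi, xi = np.rint(ys).astype(int), np.rint(xs).astype(int)
    inside = (yi >= 0) & (yi < h) & (xi >= 0) & (xi < w)
    out = np.full_like(img, fill)  # pixels rotated in from outside get `fill`
    out[inside] = img[yi[inside], xi[inside]]
    return out


def augment(images, copies_per_image=7, max_angle_deg=5.0, seed=0):
    """Keep each image and add (copies_per_image - 1) randomly rotated copies."""
    rng = np.random.default_rng(seed)
    out = []
    for img in images:
        out.append(img)
        for angle in rng.uniform(-max_angle_deg, max_angle_deg, copies_per_image - 1):
            out.append(rotate_image(img, angle))
    return np.stack(out)


# Placeholder stack standing in for 12 experimental 150x150 greyscale zx-SDMs.
experimental = np.zeros((12, 150, 150), dtype=np.uint8)
augmented = augment(experimental)  # shape (84, 150, 150)
```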

Training, validation and test results
The same procedure was implemented and a model was obtained with similar training and validation losses of about 1.3 × 10 :; . The history of loss values is shown in Supplementary Figure 2 (b). This model was also tested using the 10% test dataset (containing only simulated data) and achieved 100% classification accuracy. The predicted results for the 48 experimental zx-SDMs are shown in Figure 4 (b). Compared with Figure 4 (a), this model exhibited very good prediction ability on both the simulated and experimental data. A predicted value near zero indicates that the test image is close to the zx-SDM of the FCC structure, while a value near one signifies similarity with the zx-SDM of the L12 structure. Note that a value around 0.5 means that the test image is a mix of the two structures, as shown by image number 27 in Figure 4  The zx-SDM of zone 1 exhibits an obvious L12 signature, while this signature is unclear in zone 2 or 3. Finally, a value of 62 was taken as the threshold separating the two crystal structures, which is close to the AUC value in case 2 multiplied by 64, i.e., 63.68. Figure 6 (a) shows the 1-nm voxel map with the L12 structure probability above 62. The corresponding nanoparticle size distribution is shown in Supplementary Figure 4, but more APT data are needed to give a statistically robust result. This will be used as an input to microstructure and strength models to build the structure-property relationship 30 . Note that the minimum recognized precipitate radius can be as small as 0.5 nm, suggesting an ultra-high recognition ability. Figure 6 (b) and (c) show the species-specific z-SDMs for the segmented FCC matrix and L12 structures, respectively, plotted in arbitrary units for ease of comparison. All peak-peak distances in Figure 6 (c) corresponded to the interplanar spacing of the FCC matrix, while all peak-peak distances in Figure 6 Figure 7 (a). For example, some focus on the edges, while others highlight the foreground or background.
Going deeper into the CNN structure, the model identifies more abstract concepts. At this step, we often cannot interpret these deeper feature maps.
The output of Grad-CAM is a heatmap visualization for a given class label 37,38 . We can use this heatmap to visually verify where in the image the CNN is looking. As can be seen from Figure 7 (b), the obtained CNN model mostly focuses on the interplanar spacing feature in the deeper convolutional layers, which is the desired result.
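For readers unfamiliar with Grad-CAM, the heatmap is obtained by weighting each feature map of a convolutional layer by the spatial average of the gradient of the class score with respect to that map, summing over channels, and applying a ReLU. The NumPy sketch below shows only this final combination step on synthetic arrays; in practice the feature maps and gradients would be extracted from the trained CNN with the deep learning framework in use, and the function name and normalisation to [0, 1] are assumptions.

```python
import numpy as np


def grad_cam(feature_maps, gradients):
    """Combine (H, W, K) feature maps and class-score gradients into an (H, W) heatmap."""
    weights = gradients.mean(axis=(0, 1))  # one weight per channel, shape (K,)
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))  # weighted sum over channels
    cam = np.maximum(cam, 0.0)  # ReLU keeps regions that support the class
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Synthetic example: channel 0 responds to row 1 and supports the class,
# channel 1 responds to row 2 and counts against it.
maps = np.zeros((4, 4, 2))
maps[1, :, 0] = 1.0
maps[2, :, 1] = 1.0
grads = np.zeros((4, 4, 2))
grads[..., 0] = 0.5
grads[..., 1] = -0.5
heatmap = grad_cam(maps, grads)  # row 1 is highlighted, row 2 is suppressed
```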

Discussion
In this paper, a CNN-assisted APT approach has been successfully applied to recognize L12 ordered precipitates in the FCC matrix with an ultra-high recognition ability. The proposed CNN-APT approach has several advantages over the traditional method based on isocomposition thresholding. The most important is that the traditional method relies only on differences in composition, while the present method attempts to take into account the entire crystal structure information, including the site occupancy and the types of the different atoms, or, more exactly, how this crystallographic information manifests itself in 2D zx-SDM images. This enables the proposed method to precisely recognize ordered structures in different crystalline materials. As for the traditional method, on the one hand, the choice of a single isocomposition value to filter out matrix data is arbitrary, such as everything below 8 at.% Li in this Al-Li-Mg system (Figure 6 (d)). It is hard to ensure that matrix atoms are not included in such a precipitate characterisation, which will significantly affect the composition measurements 17 . The proposed CNN approach, by contrast, can clearly distinguish the different crystal structures. As shown in Figure 5, two obvious peaks were observed and 62 was reasonably chosen to filter the data. This threshold matched well with the AUC value obtained in case 2 multiplied by 64, i.e., 63.68. As shown in Figure 6, the average radius of precipitates from the isocomposition and proposed methods is 2.59 ± 0.9 and 2.54 ± 1.03 nm, respectively. On the other hand, the isocomposition method could fail when the differences in composition are weak, as in the short-range ordered structures occurring in Ti and high-entropy systems 13,20,21,39 . The proposed method, based on full crystal structure information, is promising for handling this challenge in the near future.
Moreover, a 1-nm voxel, i.e., one with a 0.5 nm radius, is identified as the L12 structure only when the sum of the predicted probabilities of the surrounding overlapping 4-nm voxels is above 62. Since this sum runs over 64 voxels, the average predicted probability of the individual 4-nm voxels must be above 62/64 = 0.96875. As shown in Supplementary Figure 3 However, this signal is weak, which is why the authors finally employed larger voxels for the CNN recognition. Overall, the minimum radius of a detectable nanoparticle is as small as 0.5 nm using the proposed method.
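The voting arithmetic behind this threshold can be made concrete with a short sketch. It assumes that the 4-nm voxels are scanned with a 1-nm stride, so that each interior 1-nm voxel is covered by 4×4×4 = 64 overlapping windows; the stride is inferred from the factor of 64 quoted in the text and is not stated explicitly, and the function and variable names are hypothetical.

```python
import numpy as np

WINDOW = 4              # edge length of a scanning voxel, in 1-nm cells
N_WINDOWS = WINDOW**3   # 64 overlapping windows cover each interior 1-nm voxel
THRESHOLD = 62.0        # summed-probability threshold quoted in the text


def accumulate_votes(prob, window=WINDOW):
    """Sum CNN outputs over every window covering each 1-nm voxel.

    prob[i, j, k] is the predicted L12 probability of the window whose
    corner sits at 1-nm cell (i, j, k).
    """
    nx, ny, nz = prob.shape
    votes = np.zeros((nx + window - 1, ny + window - 1, nz + window - 1))
    for dx in range(window):
        for dy in range(window):
            for dz in range(window):
                # The window at (i, j, k) covers cell (i + dx, j + dy, k + dz).
                votes[dx:dx + nx, dy:dy + ny, dz:dz + nz] += prob
    return votes


# If every window predicts L12 with certainty, interior voxels reach 64 votes.
votes = accumulate_votes(np.ones((8, 8, 8)))
l12_mask = votes > THRESHOLD
mean_probability_needed = THRESHOLD / N_WINDOWS  # 0.96875
```

Voxels near the boundary of the scanned volume are covered by fewer than 64 windows under this scheme, so they cannot reach the threshold regardless of the CNN output.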
The CNN employed in this work processes one local region of an image at a time using filters, which enables the network to look at a field rather than a single pixel 24 . Each convolutional layer contains several filters that scan the image with a kernel of a specific size. As shown in Figure 7, clean edge and grey-level features are detected by the first convolutional layer. In the deeper convolutional layers, more abstract and sparser features are obtained, and the CNN model mostly focuses on the desired interplanar spacing feature. If the deeper convolutional layers are not added, a higher loss value is obtained, as in Figure 3 (c). This ablation study highlights the importance of the deeper convolutional layers. Two case studies were performed to train the CNN with or without real experimental data. The models obtained in both cases exhibited high classification accuracy on the simulated test dataset, but only the second case, with some experimental data, performed well on both the simulated and experimental test datasets. This is attributed to the more complex situations occurring in experimental data, which are not fully captured by the simulated data.
It is worth mentioning that transfer learning was considered at the beginning but ultimately abandoned. Firstly, applying transfer learning requires enough data to train the neural networks 40 .
Here, we only employed 12 experimental zx-SDMs corresponding to the FCC structure, which were further augmented into 84 images. This is unlikely to be enough for transfer learning. Moreover, as mentioned above, the addition of noise helps with out-of-distribution generalization and transfer to real data 32 . In case 1, with only simulated data, the poor performance arose mainly because several images corresponding to the FCC structure were wrongly given high L12 probabilities. Thus, the authors used a small set of experimental data to better cover the complexity of the FCC structure. This enabled the present model to perform well in L12 structure recognition.
In this paper, zx-SDMs have been successfully used to recognize L12 structures via a CNN.
Another possible route is to work with the experimental z-SDMs (like Figure 6 (b) and (c)), to which curve analysis methods such as a 1D CNN could be applied. In addition, the potential of applying a 3D CNN directly to 3D APT point clouds could be explored, although this may be quite difficult. Note that uncertainty quantification 41,42 , a non-trivial question and a current research hot topic 43 , should be considered in the next step, as it is one of the challenges of using CNNs for scientific applications. Moreover, the proposed method can easily be extended to other ordered structures encountered in FCC alloys: one only needs to build an appropriate crystal structure library and the corresponding zx-SDMs dictionary. The methodology could also be extended to BCC and HCP alloy systems in the future. It should be pointed out that the success of the proposed CNN-APT method requires pole structures to appear in the detector event histogram (like Figure 1 (b)), where the APT data exhibit high depth resolution. The tomographic reconstruction is often calibrated using these pole structures 17 . Pole information can be found in various metallic materials, such as aluminium, magnesium, and titanium alloys. There are several other methods for extracting precipitates from the matrix using APT data, such as the pair correlation function (PCF) and K-nearest neighbour (KNN) distance analysis 15 . Two obvious advantages of these two approaches are that they do not require pole structures and that they examine the average local neighbourhood as a function of distance along all directions. A drawback is that the information along the lateral directions, which have lower resolution, could hinder the recognition of small-scale clusters/precipitates.
In fact, the SDM technique is very similar to the PCF but is applied along a particular crystallographic direction with higher resolution, so that the best structural information can be exploited to reveal potential clusters/precipitates. In the future, it would be worthwhile to explore ordered structure or cluster recognition by coupling the proposed CNN framework with these other approaches based on localized compositional measurements or solute pair distances, especially when no pole is present.
In conclusion, this work demonstrates the potential of the CNN-based method for ordered structure recognition within APT data. This image recognition approach is capable of revealing nanoscale L12 ordered particles in an FCC system using simulated zx-SDMs and a small amount of experimental data. The minimum radius of a detectable nanodomain can be as small as 0.5 nm. Compared with the traditional method based on isocomposition, the proposed CNN-APT approach is better at revealing L12 ordered precipitates, here with an average radius of 2.54 nm in the FCC Al-Li-Mg system. The next step is to extend the proposed methodology to more challenging short-range ordering phenomena.

APT experiments
The APT data studied here (45 million) for Al-6.79Li-5.18Mg (at.%) (Al-1.8Li-5Mg (wt.%)) aged for 8 h at 423 K are from Ref. 17 . A Cameca Inc. LEAP 3000XSi was used to collect the atomic-scale data with a 55% detection efficiency. IVAS 3.8.4 was used for data reconstruction and visualization. The reconstruction parameters, i.e., the field factor and image compression factor, were calibrated by the method introduced in Refs. 44,45 .

Convolutional neural networks
For the CNN used here, all layers used ReLU (Eq. 1) as the activation function, except the output layer, which used Sigmoid (Eq. 2) for classification purposes.

$\mathrm{ReLU}(x) = \max(0, x)$ (1)

$\mathrm{Sigmoid}(x) = \dfrac{1}{1 + e^{-x}}$ (2)

These two activation functions were applied to each neuron in the CNN to determine whether the neuron should be activated or not. ReLU passes positive inputs unchanged and sets negative inputs to zero, while Sigmoid maps the output to a range between 0 and 1. The binary cross-entropy 46 was chosen as the loss function, which is often used to train a binary classifier. The loss value is given by

$L = -\dfrac{1}{N}\sum_{i=1}^{N}\left[ y_i \log p(y_i) + (1 - y_i)\log\left(1 - p(y_i)\right) \right]$ (3)

where $y$ is the label (0 for FCC and 1 for L12 structure), $p(y)$ is the predicted probability of each image corresponding to the L12 structure, and the average is taken over all $N$ images. The training was performed using