Abstract
Stochastic microstructure reconstruction has become an indispensable part of computational materials science, but ongoing developments are specific to particular material systems. In this paper, we address this generality problem by presenting a transfer learning-based approach for microstructure reconstruction and structure-property predictions that is applicable to a wide range of material systems. The proposed approach incorporates an encoder-decoder process and feature-matching optimization using a deep convolutional network. For microstructure reconstruction, model pruning is implemented in order to study the correlation between the microstructural features and hierarchical layers within the deep convolutional network. Knowledge obtained in model pruning is then leveraged in the development of a structure-property predictive model to determine the network architecture and initialization conditions. The generality of the approach is demonstrated numerically for a wide range of material microstructures with geometrical characteristics of varying complexity. Unlike previous approaches that only apply to specific material systems or require a significant amount of prior knowledge in model selection and hyperparameter tuning, the present approach provides an off-the-shelf solution to handle complex microstructures, and has the potential to expedite the discovery of new materials.
Introduction
Under the Materials Genome Initiative (MGI)^{1}, materials informatics has become a revolutionary interdisciplinary research area that is fundamentally changing the methods used to discover and develop advanced materials. Among past successes of materials informatics, stochastic microstructure reconstruction – the process of generating one or a few microstructures with morphology embodied by a set of statistically equivalent characteristics – has demonstrated its significance in both processing-structure-property modeling^{2,3,4} and computational materials design^{5,6}. Therefore, the prescription of these microstructural characteristics is crucial in determining the effectiveness of microstructure reconstruction. Existing approaches for quantifying microstructure characteristics can be roughly classified into three major categories, i.e., approaches that are statistical-modeling-based, visual-features-based, and deep-learning-based.
Statistical-modeling-based approaches employ statistical models or attributes (e.g., mean particle size) to quantify microstructure morphology or features. These methods are widely used, yet their application to microstructure reconstruction is often limited to certain types of material systems and cannot be generalized. For instance, while N-point correlation functions^{7} are theoretically sound for microstructure characterization, it is computationally intractable to use high-order correlation functions (e.g., 3-point correlation and above) for microstructure reconstruction. The physical-descriptor-based approach^{8} is often limited to characterizing and reconstructing microstructures with regular geometries (e.g., spherical clusters) and is not applicable to material systems with irregular inclusions (e.g., ceramics or copolymer blends). Other examples of statistical models are approaches based on Gaussian Random Fields^{9} and Markovian Random Fields^{10,11}. A limitation of these approaches is the assumption that locally invariant properties hold throughout the microstructure, which is not always the case.
In the last decade, visual features used for object classification or face detection in the field of computer vision have been utilized by materials scientists to characterize microstructures and to study structure-property relationships. For instance, DeCost et al.^{12} used a bag of visual features, such as the Scale-Invariant Feature Transform (SIFT), to build a “visual dictionary” for describing and classifying microstructures. Chowdhury et al.^{13} utilized visual features such as histograms of oriented gradients (HoG) and local binary patterns (LBP) to distinguish micrographs that depict dendritic morphologies from those that do not contain similar microstructural features. Despite these successes, the use of visual-features-based approaches in microstructure reconstruction is unexplored and potentially limited because these visual features are essentially low-order abstractions of microstructures, which makes it difficult to reconstruct statistically equivalent microstructures from these abstractions alone in the absence of high-order information.
Revived from near-pseudoscience status during the “AI winter”^{14}, deep neural networks, which feature large model capacities and generality, have stimulated a plethora of applications across different disciplines^{15,16,17,18,19,20,21} (including materials science) in recent years. Existing deep-learning-based approaches in materials science fall into two categories: material-system-dependent or material-system-independent. Material-system-dependent approaches train deep learning models on collected materials data, and their subsequent applications are often limited to the material system used for training. For instance, Cang et al.^{22} extracted microstructure representations for alloy systems using convolutional deep belief networks. Their model^{22} was trained with 100 images of size 200 × 200 pixels. While it generated satisfactory reconstruction results for the chosen alloy material, the model was highly constrained to the type of alloy system used for the training set. Li et al.^{23} developed a Generative Adversarial Network (GAN)-based model to learn the latent variables of a given set of synthetic microstructures, but their model needs to be retrained for application to a set of microstructures with significantly different dispersion. In contrast to these material-system-dependent approaches, transfer learning provides an alternative that captures microstructure characteristics without the need for training with a set of materials data (i.e., it is material-system-independent). Transfer learning^{24,25,26} refers to the strategy of migrating knowledge to a new task from a related task that has already been learned^{27}. In the context of microstructure analysis, deep-learning models trained for benchmark computer vision tasks are fully or partially adopted to quantify microstructures or to address other complex challenges.
For instance, DeCost et al.^{28} utilized a transferred deep convolutional network to capture hierarchical representations of microstructures and then used these representations to infer the underlying annealing (i.e., processing) conditions. Lubbers et al.^{29} adopted the VGG19 model^{30}, trained on ImageNet^{31}, and used the activations of its network layers as microstructure representations to identify physically meaningful descriptors (e.g., orientation angles) via manifold learning from a set of microstructures. Nevertheless, none of these newly developed transfer learning-based approaches has addressed the challenge of microstructure reconstruction, where the extracted features from a network need to be reproduced in a statistically equivalent way. It should be noted that in Lubbers et al.^{29}, a prior texture representation based on the activations of deep convolutional layers, previously developed by Gatys et al.^{32}, was implemented to synthesize visually similar microstructures with the same texture representation. However, the more challenging problem of achieving statistical equivalence in microstructure reconstruction, which cannot be guaranteed by visual similarity, was not addressed in their work. It is also noteworthy that in prior transfer learning-based works, such as Lubbers et al.^{29}, the transferred neural network model is treated as a “black box”, and there is a lack of understanding of the relationship between the network layers and the microstructure characteristics.
In the present study, a generalized transfer learning-based, training-free approach is proposed for reconstructing statistically equivalent microstructures from arbitrary material systems based on a single given target microstructure. The input microstructure with labeled material phases is first passed through an encoding process to obtain a 3-channel representation in which material phases are distantly separated. In the meantime, the initial 3-channel representation of the reconstructed microstructure is randomly generated as the initialization. In each iteration of the reconstruction process, the 3-channel representations of both the original and reconstructed microstructures are fed into a pretrained deep convolutional network, VGG19^{30}, and a loss function is utilized to measure the statistical difference between the original and the reconstructed microstructures. The gradient of the loss function with respect to each pixel of the reconstructed microstructure is computed via backpropagation and is then utilized in gradient-based optimization to update the reconstructed microstructure. Finally, the updated 3-channel representation of the reconstructed microstructure is propagated through a decoding stage via unsupervised learning to obtain the reconstructed microstructure with labeled material phases. In addition to visual similarity, statistical equivalence of the reconstructed microstructure is achieved by the encoding-decoding pair, which ensures sharp phase boundaries with correct labeling of the phase of each pixel. In addition, to ensure the computational viability of the proposed approach, model pruning is conducted on the transferred deep convolutional network. For validation, microstructures generated by differently pruned models are evaluated via visual inspection, numerical validation, and the calculation of receptive fields, which are defined as the regions in the input space that influence a particular convolutional neural network feature.
The correlation between network layers and microstructure dispersion is also concurrently analyzed. Finally, as an extension, the knowledge learned in model pruning is utilized to determine the architecture and initialization conditions in developing a structure-property predictive model. A numerical validation using a small dataset of microstructures and their optical properties is conducted in order to verify the proposed structure-property modeling approach.
Microstructure Reconstruction
The proposed transfer learning-based approach for microstructure reconstruction migrates a pretrained deep convolutional network model^{30} created using ImageNet^{31} – an auxiliary dataset which contains millions of regular images – and adds encoding-decoding stages before and after the deep convolutional model, as illustrated in Fig. 1.
Encoding
The transferred deep neural network has strict requirements for data entry in terms of the image size and 3-channel representation alignment. Therefore, we encode the original microstructure, in which each pixel is labeled with a material phase, into a 3-channel representation so that the dimensionality of the input image fits the requirements of the transferred deep convolutional model. To ease the task of distinguishing individual phases after reconstruction, we employ maximize-minimum (maximin)^{33} distance mapping from phase labels to the 3-channel representation.
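The encoding step can be sketched as follows. This is an illustrative NumPy version, not the authors' implementation: the palette of RGB-cube corners is an assumption standing in for the maximin distance mapping, chosen because cube corners give large pairwise distances between phase colors.

```python
import numpy as np

def maximin_encode(labels, palette=None):
    """Map integer phase labels to a 3-channel image whose phase colors
    are well separated, so phases remain easy to re-identify after
    reconstruction. The default palette (RGB-cube corners) is an
    illustrative assumption, not the authors' exact mapping."""
    phases = np.unique(labels)
    if palette is None:
        # Corners of the RGB cube: mutually distant colors for up to 8 phases.
        corners = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0],
                            [0, 255, 0], [0, 0, 255], [255, 255, 0],
                            [255, 0, 255], [0, 255, 255]], dtype=float)
        palette = corners[:len(phases)]
    encoded = np.zeros(labels.shape + (3,))
    for i, p in enumerate(phases):
        encoded[labels == p] = palette[i]
    return encoded
```

Decoding then amounts to assigning each reconstructed pixel to its nearest palette color, which is why well-separated phase colors matter.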
Gradient-based microstructure reconstruction
Three steps are applied in transferring the deep convolutional neural network to microstructure reconstruction. (a) Removal of the highest network layers: It is well recognized that higher-level layers, particularly the last fully connected layer, are discriminators tuned specifically for the image classification task. For our new task of microstructure reconstruction using the transferred VGG19 model based on non-material images, we eliminate the highest 7 layers (3 fully-connected and 4 convolutional layers with the associated pooling and dropout layers) (see details in the “Model Pruning” section below). (b) Gram-matrix computation: The Gram matrix^{32}, which is usually used for measuring differences in texture between images, is taken as the measurement of statistical equivalence between the original microstructure and the reconstruction. We implement its forward and backward computations (i.e., the calculation of the Gram matrix and its gradient) by customizing a computation unit and integrating it with the transferred model. On each layer of the deep convolutional network, the Gram matrices of the original and reconstructed microstructures are first computed based on the activation values; then the differences between the Gram matrices of the original and reconstructed microstructures on the corresponding layers are added to the optimization objective. (c) Gradient computation via backpropagation: State-of-the-art deep learning platforms provide a fast and convenient way to calculate gradients via backpropagation through the computation graph. The gradient of the objective (Gram-matrix difference) with respect to the reconstruction image pixels is thus calculated via backpropagation. The gradient is then fed into a nonlinear optimizer (either L-BFGS-B^{34} or Adam^{35}) to update the reconstruction iteratively until convergence.
It is noted that stochasticity of the microstructure reconstructions is achieved by random initialization of the microstructure image before the backpropagation operation.
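The objective in step (b) can be sketched as follows. The actual approach backpropagates this loss through the VGG19 layers; this minimal NumPy sketch shows only the Gram-matrix computation and the loss on a single feature map of shape (channels, height, width), with the per-position normalization being a common convention rather than a detail stated in the text.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map of shape (C, H, W): channel-wise
    inner products of the flattened activations, normalized by the
    number of spatial positions."""
    C, H, W = features.shape
    F = features.reshape(C, H * W)
    return F @ F.T / (H * W)

def gram_loss(feat_orig, feat_recon):
    """Squared Frobenius distance between the Gram matrices of the
    original and reconstructed microstructures on one layer; summing
    this over the retained layers gives the reconstruction objective."""
    G_o, G_r = gram_matrix(feat_orig), gram_matrix(feat_recon)
    return np.sum((G_o - G_r) ** 2)
```

In the full method, the gradient of this loss with respect to the input pixels is obtained automatically by the deep learning framework's backpropagation rather than derived by hand.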
Decoding
After obtaining the 3channel representation of the reconstruction, an unsupervised learning approach is used to convert the 3channel representation back to the desired representation: images with labeled material phases. Furthermore, considering that volume fraction (VF) is a critically important microstructural descriptor, a Simulated Annealingbased VF matching process is exercised at the end to ensure the reconstructed image has the same VF as the original through erosion or dilation.
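A greatly simplified stand-in for the VF matching step is sketched below. The paper uses a Simulated Annealing-based process; this greedy NumPy version merely illustrates the erosion/dilation idea, growing or shrinking the white phase one boundary-pixel layer at a time until the volume fraction matches.

```python
import numpy as np

def match_volume_fraction(img, target_vf, max_iter=500):
    """Greedy illustrative stand-in for the paper's simulated-annealing
    VF matching: dilate (grow) or erode (shrink) the white phase along
    its boundary until its volume fraction reaches target_vf."""
    img = img.astype(bool).copy()
    for _ in range(max_iter):
        vf = img.mean()
        if abs(vf - target_vf) < 1.0 / img.size:
            break
        # Count white 4-neighbors of every pixel (periodic boundaries).
        nb = sum(np.roll(img, s, axis=a) for s in (-1, 1) for a in (0, 1))
        if vf < target_vf:          # dilate: black pixels touching white
            candidates = (~img) & (nb > 0)
        else:                       # erode: white pixels touching black
            candidates = img & (nb < 4)
        idx = np.argwhere(candidates)
        if len(idx) == 0:
            break
        need = int(round(abs(vf - target_vf) * img.size))
        for r, c in idx[:need]:
            img[r, c] = vf < target_vf
    return img.astype(int)
```

A simulated-annealing variant would instead propose random boundary flips and accept them probabilistically, which avoids the directional bias of this greedy sweep.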
Model pruning of the transferred deep convolutional network
While the proposed microstructure reconstruction approach is capable of generating not only visually similar but also statistically equivalent microstructures, its consumption of computational resources is significant, which hinders its wide application on computational platforms with limited capacity. The two major bottlenecks are GPU memory consumption and the number of backpropagation operations. High GPU memory consumption results in numerical errors on lower-end computational platforms, and a great number of backpropagation operations significantly slows down gradient computation. Given that both GPU memory consumption and the number of backpropagation operations are affected by the depth of a deep convolutional network, we set our objective to reduce the hierarchical depth of the transferred model for computational economy and efficiency. Model pruning is implemented for this purpose in two steps: (1) we gradually remove the top layer(s) from the existing model and generate reconstructed microstructures, and (2) we analyze the trade-off between model depth and reconstruction accuracy by both visual inspection and numerical validation using the two-point correlation function and the lineal-path correlation function. We also compute the receptive field of each pruned model and investigate how dispersion in the microstructure determines the size selection of receptive fields, which plays a decisive role in model pruning.
Structure-property prediction
While microstructure reconstruction approaches are capable of generating statistically equivalent realizations, it is computationally costly to simulate the material properties of these microstructures via Finite Element Analysis (FEA). Fortunately, as structure-property mapping data accumulate, it becomes feasible to train machine-learning models, which have significantly shorter runtimes than FEA, to replace the physical simulations. As deep convolutional networks become increasingly prevalent, a common practice in building a structure-property predictive model is to transfer an existing pretrained model either fully or partially and fine-tune the weights using backpropagation^{36}. A crucial choice in this transfer learning process is determining which part of the pretrained network should be adopted. Among existing studies that transfer pretrained models, this choice varies considerably. For instance, Yosinski et al.^{36} adopted the full AlexNet^{18} in an image classification task, while Li et al.^{23} used only the first four convolutional layers of a pretrained Generative Adversarial Network (GAN) model to build a structure-property predictive model for optical materials.
While the determination of which portion of the pretrained model to adopt is usually subjective, the aforementioned model pruning study provides an objective guideline. Since pruning the reconstruction model reveals which part of the pretrained model is necessary to transfer, it yields a rule specifying the network architecture and initialization conditions. In this paper, the proposed approach is compared with two alternative network architectures and demonstrates enhanced stability.
Results and Discussion
Material systems
Two different datasets have been prepared for the tasks of microstructure characterization and reconstruction (MCR) and structure-property prediction, respectively. First, a dataset of microstructure images obtained by state-of-the-art microstructure imaging techniques, covering carbonate, polymer composites, sandstone, ceramics, a block copolymer, a metallic alloy, and 3-phase rubber composites (the 1^{st} row in Fig. 2), has been collected for demonstration and validation of the proposed MCR approach. Given the great variety of microstructural morphologies, this dataset provides a comprehensive testbed for comparing our proposed approach with other MCR approaches. Among all test samples, special attention is given to two challenging systems – the 2-phase block copolymer and the 3-phase rubber composite. The block copolymer sample has a fingerprint-shaped microstructure, in which anisotropy is observed locally whereas isotropy holds globally. In contrast, the rubber composite sample has higher local isotropy, yet its three-phase nature is difficult to capture using any prior approach. For the second task of structure-property prediction, an additional dataset consisting of structure-property pairs was obtained by generating 5,000 microstructure patterns using the Gaussian Random Field^{9} (GRF) method (a popular and computationally efficient choice for generating microstructures of optical materials) with a wide range of correlation parameters, followed by simulation of their light absorption rates at a wavelength of 600 nm using Rigorous Coupled Wave Analysis (RCWA). RCWA is a Fourier-space-based algorithm that provides the exact solution to Maxwell’s equations for electromagnetic diffraction. While we set the diffraction order in RCWA such that each simulation takes less than 5 minutes to complete, more accurate solutions can be obtained by choosing higher diffraction orders.
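A common way to generate such level-cut GRF microstructures is sketched below; the Gaussian spectral filter and its `corr_len` parameter are illustrative assumptions rather than the paper's exact GRF formulation. White noise is low-pass filtered in Fourier space and then thresholded at the quantile that yields the desired volume fraction.

```python
import numpy as np

def grf_microstructure(n=128, corr_len=8.0, vf=0.5, seed=0):
    """Level-cut Gaussian random field (illustrative): filter white noise
    with a Gaussian spectrum in Fourier space, then threshold the field
    at the quantile giving the desired volume fraction vf.
    corr_len is an assumed correlation-length parameter."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n, n))
    k = np.fft.fftfreq(n)
    kx, ky = np.meshgrid(k, k)
    spectrum = np.exp(-(kx**2 + ky**2) * corr_len**2)  # isotropic low-pass
    field = np.real(np.fft.ifft2(np.fft.fft2(noise) * spectrum))
    cut = np.quantile(field, 1.0 - vf)
    return (field > cut).astype(int)
```

Sweeping `corr_len` (and the threshold) over a range of values is one way to produce a family of morphologies like the 5,000-pattern dataset described above.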
Validation of Microstructure Reconstruction Results
The quality of microstructure reconstruction is assessed both quantitatively, based on numerical metrics, and qualitatively, through visual inspection. As the two-point correlation function is the most commonly used statistical function to evaluate microstructure reconstructions^{8,10}, we adopt it as one of the quantitative evaluation metrics in the present work. However, per Torquato^{37}, the two-point correlation function by itself is not sufficient for evaluating the statistical equivalence of microstructures. In this work, the lineal-path correlation function^{38} is used as an additional metric to quantify the statistical similarity between the original and reconstructed microstructures^{10}. Since most statistical functions are reduced representations of microstructures, they cannot reveal the microstructure characteristics completely. To this end, visual inspection was also conducted as a complementary validation to the numerical comparisons.
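For a binary medium, the two-point correlation function can be computed efficiently as the autocorrelation of the phase indicator image via the FFT. The sketch below assumes periodic boundaries and returns the full 2-D autocorrelation map (whose value at zero lag equals the volume fraction); radial averaging, which the paper's 1-D curves would additionally require, is omitted.

```python
import numpy as np

def two_point_correlation(img):
    """Two-point correlation map S2 of a binary image via FFT-based
    autocorrelation (periodic boundaries assumed). Entry [0, 0] equals
    the volume fraction; distant entries approach its square."""
    ind = img.astype(float)
    F = np.fft.fft2(ind)
    auto = np.real(np.fft.ifft2(F * np.conj(F))) / ind.size
    return auto
```

The lineal-path function has no comparably simple FFT shortcut; it is typically computed by scanning chords of each phase along sampled directions.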
As depicted in Fig. 2, in addition to the proposed transfer learning-based approach (Row 2), four existing MCR approaches (i.e., decision-tree-based synthesis^{10}, Gaussian Random Field^{9,39,40}, two-point correlation^{7,41}, and physical descriptor^{8}) are applied to each of the microstructures in the collected data, except for the three-phase rubber composite sample in the last column, since none of the commonly used approaches can process the multiphase microstructure of the rubber composite (three material phases). The discrepancy between the correlation functions of the original microstructure and those of the reconstructions is measured by

error rate = S_{1}/S_{2} × 100%,    (1)
where S_{1} is the area between the two correlation functions and S_{2} is the area under the correlation function of the original microstructure. The error rates of the reconstruction for each method and each material sample are presented in Tables 1 and 2. It should be noted that in the copolymer and ceramic samples, the white material phase is almost entirely connected, and thus it is inappropriate to apply the physical-descriptor-based approach. In addition, for the alloy material system (Fig. 2, column 6), the proposed approach is significantly better than the other microstructure reconstruction approaches at generating visually satisfactory reconstructions. Since visual similarity between the original and reconstructed microstructures is a necessary qualitative criterion to validate the equivalence of microstructures, we do not conduct further numerical validation for the alloy material system in the later part of this paper.
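Given two sampled correlation curves, Eq. 1 can be evaluated as below; the trapezoidal-rule integration over the distance grid is an assumption about the numerical quadrature, which the paper does not specify.

```python
import numpy as np

def _trapz(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

def reconstruction_error(r, s_orig, s_recon):
    """Eq. 1: S1 (area between the two correlation functions) divided by
    S2 (area under the original's correlation function), as a percentage."""
    s1 = _trapz(np.abs(np.asarray(s_orig) - np.asarray(s_recon)), r)
    s2 = _trapz(s_orig, r)
    return 100.0 * s1 / s2
```

This normalization by S_{2} also explains the remark below about the low-loading polymer nanocomposite: a small S_{2} inflates the error rate for a given S_{1}.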
From Table 1, we find that the proposed transfer learning-based approach using deep convolutional networks outperforms all other reconstruction approaches in four of the five material systems evaluated numerically. For the sandstone sample, the accuracy of reconstruction using the proposed approach is only slightly lower than that of the two-point correlation function-based approach. One may expect the error evaluated using the two-point correlation function to be smallest when the two-point correlation approach is used for reconstruction, because the metric is used directly as an objective. However, the two-point correlation function-based reconstruction uses simulated annealing, which has difficulty converging to the global minimum, leading to poorer performance. Moreover, while the reconstructions from the GRF and two-point correlation function-based approaches on the copolymer material system achieve relatively low error rates (0.85% and 1.20%, respectively), those reconstructions are visually different from the original microstructure. This again verifies Torquato’s proposition^{37} that two-point correlation only partially reveals the statistical equivalence of the original microstructure and its reconstructions. Finally, the error rates for the polymer composite material system are observed to be higher than those of other systems. For the low-loading polymer nanocomposite material system, the values of S_{2} in Eq. 1 are lower than those for the other material systems studied in this work. Therefore, a slight difference between the correlation functions (S_{1}) leads to a significantly larger error rate.
Table 2 presents the error rates evaluated using the lineal-path correlation function. Three major findings are summarized from this comparison. (1) The transfer learning-based approach achieves a low error rate (<8%) in all five samples, while the performance of the other four methods varies significantly across material systems. (2) While the two-point correlation function-based approach reaches very low error rates in Table 1, its error rates for the lineal-path function are very large. This result is reasonable, since the two-point correlation function-based approach applies pixel-switching in its reconstruction process, and the connectivity of the clusters is not guaranteed. (3) While the transfer learning-based approach demonstrates superiority in terms of generality, the decision-tree-based approach is very competitive and also achieves very low error rates in three of the five samples.
While the error rates of the proposed transfer learning-based convolutional network approach and the two-point correlation-based approach are very close in a few cases (e.g., the copolymer), their reconstructions can differ significantly under visual inspection. This again implies that while lower-order statistical functions can capture lower-order statistical equivalence, high-order metrics are needed to completely assess statistical equivalence.
Since both the two-point correlation function and the lineal-path correlation function are low-order representations of microstructures, they do not fully capture the high-order characteristics of the original and reconstructed microstructures. To this end, we also visually inspected the reconstructions of the different material systems (Fig. 2) and compared our findings with the results in Tables 1 and 2. In general, the visual similarity between the original microstructures and the reconstructed ones agrees with the error rates in Tables 1 and 2, with the exception of the block copolymer reconstruction using the two-point correlation function-based approach. In this case, the reconstruction achieves a 1.20% error rate in the two-point correlation function and an 8.09% error rate in the lineal-path correlation function, yet it looks like a random white-noise image upon visual inspection. This finding again confirms Torquato’s proposition^{37} that low-order statistical functions are not capable of representing the microstructure completely.
In addition to demonstrating the advantages of the proposed approach through quantitative comparative studies (Fig. 2 and Tables 1 and 2), we demonstrate its versatility by analyzing complex microstructures, such as those in the block copolymer and 3-phase rubber composite samples. As illustrated in Fig. 3, the original fingerprint-shaped copolymer sample has very different local anisotropy at different locations, whereas global isotropy holds. The proposed deep convolutional network-based approach accurately reproduces this characteristic in its reconstruction, while the decision-tree-based approach generates a diagonally oriented anisotropic reconstruction.
The advantages of the proposed approach are also demonstrated in Fig. 4 through analysis of a rubber composite sample that consists of two rubber phases (Butadiene rubber (BR, white) and Styrene-Butadiene rubber (SBR, blue)) and one filler phase (carbon black (CB, cyan)), at two different carbon black compositions. Given the multiphase nature of this material, the statistical equivalence of the original microstructures (Fig. 4(a,c)) and their reconstructions (Fig. 4(b,d)) is evaluated using the two-point correlation function in a one-vs-rest manner: specifically, the three-phase microstructure is first binarized into three binary images (BR vs. the rest, SBR vs. the rest, and CB vs. the rest). The correlation function is then applied to the binary images in order to validate the statistical equivalence. Using this method, the statistical equivalence of the original microstructures and their reconstructions is validated (Table 3).
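The one-vs-rest binarization described above is a one-liner; the sketch below shows it explicitly so the reduction from a multiphase image to per-phase binary images is unambiguous. The per-pixel masks are mutually exclusive and cover the image, so two-point statistics defined for two-phase media apply to each mask in turn.

```python
import numpy as np

def one_vs_rest(labels):
    """Split a multi-phase label image into one binary indicator image
    per phase (that phase vs. the rest), enabling two-phase statistics
    such as the two-point correlation function on each."""
    return {p: (labels == p).astype(int) for p in np.unique(labels)}
```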
Numerical pruning and understanding the network model hierarchy
While it has been shown that the proposed approach is capable of accurately reconstructing statistically equivalent microstructures for a wide range of material systems, its application is potentially limited by its high computational cost (primarily GPU memory consumption and the number of operations in backpropagation). In our experiments, loading the full VGG19 model consumes 11,541 MB of GPU memory on an Nvidia GeForce Titan Xp graphics card and entails a significant number of backpropagation operations. Therefore, in this section, model reduction is studied by eliminating some network layers to increase computational efficiency and viability. Noting that different computational platforms may perform very differently for the same model, in this study the number of weight parameters is used to measure model complexity.
The model pruning in this work is achieved by first gradually eliminating high-level layers, followed by elimination of low-level layers from the transferred deep convolutional network. This sequence of layer removal not only keeps the lower part of the network architecture intact, but also aligns with the understanding of network architectures presented by Yosinski et al.^{36} In other words, higher-level layers are likely to be high-level concept discriminators for specific tasks. Thus, their elimination may impact the reconstruction less significantly than the removal of lower-level layers (i.e., the ones close to the image), which are usually interpreted as general local feature extractors similar to Gabor filters^{42} or color blobs. In the vanilla version of the proposed approach, the first convolutional layer and the first four pooling layers are included in the loss function. The inclusion of these five layers essentially requires the loading of the first 12 convolutional layers, which introduce 10,581,696 parameters (not counting the biases). The removal of the highest pooling layer (pooling_4 in VGG19) of the five reduces the number of included convolutional layers to 8, which have 2,324,160 parameters (21.96% of the previous count). Further elimination of the other pooling layers (pooling_3, pooling_2 and pooling_1) reduces the number of convolutional layers to 4, 2 and 1, respectively, which corresponds to 259,776/38,592/1,728 weight parameters (2.45%, 0.36% and 0.02% of the full count), respectively. Figure 5 illustrates the reconstructions using different selections of layers. From this comparison, we find three important results. (1) The elimination of the highest pooling layer has an insignificant impact on the reconstruction results. This result is expected, as those higher pooling layers are discriminators specifically tuned for the original AI task (i.e., image classification on the ImageNet dataset). (2) From the comparison (Fig. 5A3,A4,B3 and B4), the removal of the third pooling layer results in the loss of long-distance dispersion equivalence. Specifically, in Fig. 5(A4) the global variation of local anisotropies is lost, while the variation of cluster-cluster distances is decreased in Fig. 5(B4). (3) From the comparison (Fig. 5A4,A5,B4 and B5), the elimination of the lowest pooling layer leads to a significant loss of short-distance (local morphological) equivalence. Our observations further validate our hypothesis that in transferring deep learning models, the highest neural network layers (i.e., layers higher than pooling_3) may be eliminated because they are discriminators for the original ImageNet^{31} image classification task and are not useful for microstructure reconstruction. In contrast, network layers lower than pooling_3 need to be retained to preserve the dispersive characteristics of the reconstructed microstructure. Figure 5(C,D) illustrates the reconstruction error rates (Eq. 1) computed using the two-point correlation function and the lineal-path correlation function. It is observed that removal of pooling_3 and pooling_2 does not affect the reconstruction accuracy significantly. Since neither of the two correlation functions can fully capture the microstructure characteristics, in determining the optimal pruned network architecture we still retain the layers that are necessary for both visual similarity and statistical equivalence (i.e., layers lower than pooling_3).
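The parameter counts quoted above follow directly from summing k² · C_in · C_out over the retained 3 × 3 convolutions of VGG19 (biases excluded), which the short sketch below reproduces.

```python
# Input channels followed by the output channels of VGG19's first 12
# 3x3 convolutional layers (blocks conv1 through conv4).
VGG19_CONV_CHANNELS = [3, 64, 64, 128, 128, 256, 256, 256, 256,
                       512, 512, 512, 512]

def conv_params(n_layers, channels=VGG19_CONV_CHANNELS, k=3):
    """Total weight parameters (no biases) of the first n_layers
    3x3 convolutions: sum of k*k*C_in*C_out over those layers."""
    return sum(k * k * channels[i] * channels[i + 1]
               for i in range(n_layers))
```

For example, `conv_params(1)` gives 3 · 3 · 3 · 64 = 1,728 for the single retained conv layer, and `conv_params(12)` gives the full 10,581,696 count.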
In addition to the numerical study illustrated above, model pruning is also analyzed from the perspective of receptive fields. The receptive field is an important concept in deep convolutional networks and is defined as the region in the input space that influences a particular convolutional neural network feature. As all the convolutional filters in the VGG19 model are 3 × 3 pixels, it is relatively straightforward to compute the receptive fields for each layer (Table 4). The sizes of the receptive fields can be interpreted as follows: for the lowest convolutional layer (conv1_1), varying each entry of its output affects a small region of 3 × 3 pixels, while altering the output of the pooling_4 layer influences a large area of 160 × 160 pixels (the full microstructures in this work are 256 × 256 pixels). The sizes of the receptive fields of each layer also reveal their individual roles in controlling the microstructure reconstruction. Specifically, higher-level layers (e.g., pooling_3) control the long-distance dispersion in the microstructure, while lower-level layers (e.g., conv1_1 and pooling_1) specify local geometries. This again validates our findings in the comparison of reconstructions in Fig. 5. For the two material systems in Fig. 5, a 72 × 72 pixel window of the microstructure is capable of capturing most of the statistical characteristics; therefore, layers higher than pooling_3 can be eliminated while retaining the quality of the microstructure reconstruction.
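A common recursion for receptive-field sizes is sketched below. Note that conventions differ (e.g., whether pooling and padding are counted at the input or output side), so the exact values in Table 4 may follow a different convention than this standard one; the sketch only illustrates how the field grows through stacked 3 × 3 convolutions and 2 × 2 poolings.

```python
def receptive_field(layers):
    """Standard receptive-field recursion for a chain of conv/pool
    layers given as (kernel_size, stride) pairs:
    r_out = r_in + (k - 1) * j, where the jump j (input pixels per
    output step) is multiplied by each layer's stride."""
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# VGG19's early pattern: 3x3/stride-1 convs and 2x2/stride-2 poolings.
conv, pool = (3, 1), (2, 2)
```

Under this recursion, one conv layer sees 3 × 3 pixels, two convs see 5 × 5, and each pooling both enlarges the field and doubles the jump for all subsequent layers.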
Structure-property Prediction
In addition to using deep learning for microstructure reconstruction, a natural extension is to employ the architecture of deep convolutional networks for analyzing structure-property relationships of advanced materials. It has been found by Yosinski et al.^{36} and LeCun et al.^{21} that transfer learning (i.e. using pre-trained weights to initialize the network) improves the stability and accuracy of the predictive model. Despite these successes, there is no established rule for determining how many layers to transfer, which introduces subjective choice. Typically, the inclusion of more pre-trained layers increases the flexibility of the network model but also increases the associated computational cost and the likelihood of overfitting. To resolve this issue, the model pruning investigated in the previous section is used to identify the pre-trained layers necessary for describing dispersive characteristics in microstructures, and thus provides a guideline for this determination. Specifically, we propose the following general rule for determining the number of transferred layers: the layers remaining in the pruned microstructure reconstruction model are regarded as those necessary to describe microstructure characteristics; therefore, they also need to be adopted in developing structure-property predictive models. In the context of VGG19 models, all the layers beyond pooling_3 are discarded in pruning, and the remaining 8 convolutional layers and 3 pooling layers are utilized to initialize the neural network for structure-property predictions.
To demonstrate the effectiveness of this approach, a numerical study is conducted to develop a structure-property predictive model for optical microstructural materials. 250 microstructures of size 128 × 128 pixels are generated using the Gaussian Random Field approach^{9}, and their corresponding optical absorption properties (a scalar value between 0 and 1) are simulated using the Rigorous Coupled-Wave Analysis^{43} (RCWA) simulation code. RCWA is a Fourier-domain-based algorithm that can solve scattering problems for both periodic and aperiodic structures. Details of RCWA modeling can be found in refs^{44,45}.
This dataset is split into 200 and 50 microstructures for training and testing, respectively. Figure 6 shows several examples of the generated microstructures. The architecture of the neural network is constructed using the layers lower than the pooling_3 layer in VGG19. The output of the pooling_3 layer is flattened, followed by two fully connected layers of size 2048 and 1024 with Rectified Linear Unit (ReLU) and dropout (p = 0.5) operations. The weights in the transferred layers are initialized using the pre-trained weights in the VGG19 model, while the remaining ones are initialized randomly. Two additional experiments are conducted as control groups: group 1 – instead of adopting layers lower than pooling_3, we only transfer the ones below pooling_2, with the rest of the settings kept the same as in the proposed approach; group 2 – layers lower than pooling_4 are adopted, with the same settings for the fully connected layers. The Adam optimizer (learning rate = 0.0005, beta1 = 0.5, beta2 = 0.99) is used to fine-tune the parameters. The mini-batch size is set to 50 and the number of epochs is 4,000. For each group, the training is repeated 15 times to investigate accuracy and stability.
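As a rough sanity check on the architecture just described, the layer shapes and parameter counts can be tallied directly. In this sketch, the channel widths are the standard VGG19 values and the scalar absorption output head is an assumption (the text specifies only the two fully connected layers):

```python
# Shape/parameter bookkeeping for the transferred predictive network:
# VGG19 layers below pooling_3, then FC-2048 and FC-1024 heads.

def conv_params(c_in, c_out, k=3):
    """Parameter count of one k x k convolution (weights + biases)."""
    return (k * k * c_in + 1) * c_out

# VGG19 prefix retained after pruning: 8 conv layers + 3 pools.
# Each tuple is (input channels, output channels, number of convs).
blocks = [(3, 64, 2), (64, 128, 2), (128, 256, 4)]
n_px, params = 128, 0          # 128 x 128 input microstructures
for c_in, c_out, n_convs in blocks:
    for i in range(n_convs):
        params += conv_params(c_in if i == 0 else c_out, c_out)
    n_px //= 2                  # each block ends in a 2x2 max-pool

flat = n_px * n_px * 256        # flattened output of pooling_3
params += (flat + 1) * 2048     # FC-2048
params += (2048 + 1) * 1024     # FC-1024
params += (1024 + 1) * 1        # scalar absorption output (assumed)
```

Most of the parameters sit in the first fully connected layer, which is one reason transferring more convolutional layers (control group 2, with a smaller flattened size but more conv parameters) changes the overfitting behavior.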
Figure 7 shows the results of the proposed approach and the two control groups. Compared to the proposed approach, control group 1 is observed to underfit (i.e., higher mean error and larger variance of the error), while control group 2 shows a higher likelihood of overfitting (i.e., more outliers with large associated error). This comparison confirms the significance of the knowledge learned from microstructure reconstruction model pruning, and it also validates the proposed structure-property modeling approach.
Conclusion
Microstructure reconstruction and structure-property prediction are two challenging but valuable tasks in computational materials science. In this work, a transfer learning-based approach for reconstructing microstructures is first proposed. A comprehensive comparison of results for multiple materials between the proposed approach and existing approaches is conducted to demonstrate the accuracy and generality of the proposed approach. To reduce the computational cost, the transferred deep convolutional network is pruned, and an understanding of the correspondence between neural network layers and long/short-range dispersions in microstructures is drawn by visual inspection and by analyzing the receptive fields. The knowledge learned in model pruning also guides the determination of the pre-trained layers to be transferred in developing structure-property predictive models. In summary, the proposed approach provides an end-to-end – i.e. image-to-image for reconstruction or image-to-property for property prediction – and off-the-shelf solution which generalizes well and requires minimal prior knowledge of material systems.
Despite the advantages demonstrated in this work, the present approach has some potential limitations. For instance, while the Gram-matrix matching ensures statistical equivalence in stochastic microstructure reconstructions, it is not guaranteed to be applicable to deterministic microstructures such as periodic crystallographic structures. The reconstruction of these deterministic microstructures may be handled by adding customized loss function terms to the proposed approach. In addition, the adoption of a pre-trained ImageNet deep convolutional network implicitly constrains the application of the proposed approach to 2D microstructures. 3D microstructure reconstruction tasks may be solved by extending the proposed transfer learning strategy using existing 3D convolutional network models^{46}. From the perspective of deep learning, advanced deep learning models such as ResNet^{47} may further improve the results.
Method
Encoding via Maximizing Minimum Distance
Microstructures are typically represented as N × M matrices (where N and M correspond to the height and width of the microstructure image, respectively). The first step is to convert the N × M matrices to 3-channel representations that can be fed into the transferred deep convolutional model. While there are a variety of mapping methods for this conversion, here we suggest an encoding strategy that maximizes the minimum Euclidean distance between the encoded phase coordinates. This encoding strategy is chosen for the ease of distinguishing individual phases after the gradient-based optimization for reconstruction in the encoded space is carried out.
Maximization of the minimum distances between a number of points in the feature space has typically been solved by gradient-based search algorithms or stochastic search algorithms such as simulated annealing^{48}. It can also be formulated as an NP-complete independent-set problem in geometric intersection graphs, which can be addressed by approximation algorithms^{33}. However, given that in most cases the number of distinct material phases in an original microstructure is not large, it is not necessary to pursue the farthest separations as long as the phase clusters after reconstruction can be properly distinguished. Therefore, for material systems with no more than three (3) material phases, such as those in this work, we take a simpler approach – Latin Hypercube Sampling^{49} (LHS). Specifically, by setting the number of sampling points equal to the number of distinct material phases in the original microstructure, LHS with a maximin distance criterion creates a 3-vector representation for each material phase. Then, for each pixel in the N × M microstructure, we replace the original scalar phase label by the corresponding 3-vector, which leads to an N × M × 3 matrix representation.
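A minimal sketch of this encoding step follows, using a simple best-of-n maximin selection over LHS draws. The trial count and the scaling of the codes to the [0, 255] range are implementation choices for illustration, not prescribed by the text:

```python
import numpy as np

def maximin_phase_codes(n_phases, n_trials=200, seed=0):
    """Pick one 3-vector code per material phase by Latin Hypercube
    Sampling, keeping the candidate set that maximizes the minimum
    pairwise Euclidean distance (a simple maximin criterion)."""
    rng = np.random.default_rng(seed)
    best, best_d = None, -1.0
    for _ in range(n_trials):
        # One LHS draw: each of the 3 dimensions is an independently
        # permuted set of stratified points in [0, 1).
        strata = rng.permuted(np.tile(np.arange(n_phases), (3, 1)), axis=1).T
        pts = (strata + rng.random((n_phases, 3))) / n_phases
        d = min(np.linalg.norm(pts[p] - pts[q])
                for p in range(n_phases) for q in range(p + 1, n_phases))
        if d > best_d:
            best, best_d = pts, d
    return best * 255.0  # scale to the encoded [0, 255] range

def encode(micro, codes):
    """Replace each scalar phase label by its 3-vector: N x M -> N x M x 3."""
    return codes[micro]

codes = maximin_phase_codes(2)
micro = np.array([[0, 1], [1, 0]])   # toy 2 x 2 two-phase microstructure
enc = encode(micro, codes)           # shape (2, 2, 3)
```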
Deep Convolutional Characterization and Reconstruction
While many models have been developed for the ImageNet task, such as GoogLeNet^{50} and ResNet^{47}, the VGG19 model^{30} is selected in this work because of its structural simplicity and regularity. The original VGG19^{30} model has 19 layers (3 fully connected layers and 16 convolutional layers). In transferring this model, all layers beyond the 2^{nd} highest pooling layer are first eliminated (i.e. 1 fully connected layer, 1 pooling layer and 3 convolutional layers). Both the network structure and the network parameters from the VGG19 model are inherited as the transferred deep convolutional model in this work.
Microstructure reconstruction using the transferred deep convolutional model is essentially a gradient-based optimization process. The objective function to be minimized is the sum of Gram-matrix differences on the selected neural network layers, and the variables to be optimized are the pixel values of the microstructure. The optimization process can be decomposed into three steps: (1) Initialization: an N × M × 3 matrix is initialized randomly, with each entry drawn from a uniform distribution. Different initializations will result in different statistically equivalent microstructure reconstructions. (2) Forward-propagation: at each iteration of the optimization, the N × M × 3 representations of the original microstructure and the reconstruction are forward-propagated simultaneously through the deep learning network, creating corresponding activation values on each layer. (3) Back-propagation: Gram matrices^{32} on the selected layers are matched between the reconstruction and the original microstructure to compute the difference (i.e., loss). The gradient of the loss with respect to each pixel of the reconstruction is then computed via back-propagation on a GPU, and it is fed into a nonlinear optimization algorithm to update the pixel values of the microstructure reconstruction. Steps (2) and (3) are executed iteratively until the solution converges to a locally optimal state of the microstructure reconstruction.
The convolutional deep neural network in the present approach is composed of two sets of computation units: regular units (convolutional operation, Rectified Linear Unit transformation, and pooling operation) and customized units (Gram-matrix related computations). While the back-propagation of regular units is well integrated in the popular deep learning platforms, the Gram-matrix related derivatives still need to be implemented for the customized units. Here we demonstrate the derivation briefly. Let \({\boldsymbol{x}}\) and \(\tilde{{\boldsymbol{x}}}\) denote the original and reconstructed microstructure in the encoded space at iteration #n, respectively. \({\boldsymbol{x}}\) and \(\tilde{{\boldsymbol{x}}}\) are first passed through the transferred convolutional network. In each layer i of the network, \({\boldsymbol{x}}\) and \(\tilde{{\boldsymbol{x}}}\) activate stacks of feature maps, \({F}^{i},{\tilde{F}}^{i}\in { {\mathcal R} }^{{N}_{i}\times {M}_{i}}\), where N_{i} is the number of filters and M_{i} is the size of the vectorized feature maps in layer i. Let \({F}_{jk}^{i},\,{\tilde{F}}_{jk}^{i}\) denote the activations of the j^{th} filter at position k in layer i for \({\boldsymbol{x}}\) and \(\tilde{{\boldsymbol{x}}}\). The Gram matrix^{32} of each microstructure is defined by the inner products between feature maps p and q in layer i:

\[{G}_{pq}^{i}=\sum _{k}{F}_{pk}^{i}{F}_{qk}^{i},\qquad {\tilde{G}}_{pq}^{i}=\sum _{k}{\tilde{F}}_{pk}^{i}{\tilde{F}}_{qk}^{i}\]
The contribution of layer i to the loss is:

\[{E}_{i}=\frac{1}{4{N}_{i}^{2}{M}_{i}^{2}}\sum _{p,q}{({G}_{pq}^{i}-{\tilde{G}}_{pq}^{i})}^{2}\]
The total loss is the weighted sum over the selected layers:

\[L=\sum _{i}{w}_{i}{E}_{i}\]
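The Gram-matrix loss described above can be sketched directly in NumPy. Here random arrays stand in for the feature maps that would, in practice, be produced by forward-propagating both microstructures through the transferred network:

```python
import numpy as np

def gram(F):
    """Gram matrix G_pq = sum_k F_pk F_qk of an (N_i x M_i) feature-map stack."""
    return F @ F.T

def layer_loss(F, F_tilde):
    """Contribution E_i of one layer to the Gram-matching loss."""
    N_i, M_i = F.shape
    G, G_t = gram(F), gram(F_tilde)
    return ((G - G_t) ** 2).sum() / (4.0 * N_i ** 2 * M_i ** 2)

def total_loss(features, features_tilde, weights):
    """Weighted sum L = sum_i w_i E_i over the selected layers."""
    return sum(w * layer_loss(F, Ft)
               for w, (F, Ft) in zip(weights, zip(features, features_tilde)))

rng = np.random.default_rng(1)
F = rng.random((4, 10))       # stand-in: 4 filters, 10 vectorized positions
loss_same = layer_loss(F, F)  # identical feature stacks give zero loss
```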
Next, a gradient-based optimization with the aim of minimizing the total loss between the original and the reconstructed microstructures is utilized to update the reconstructed microstructure. The gradient \(\frac{\partial L}{\partial \tilde{{\boldsymbol{x}}}}\) is decomposed by the chain rule as:

\[\frac{\partial L}{\partial \tilde{{\boldsymbol{x}}}}=\sum _{i}{w}_{i}\,\frac{\partial {E}_{i}}{\partial {\tilde{F}}^{i}}\,\frac{\partial {\tilde{F}}^{i}}{\partial \tilde{{\boldsymbol{x}}}}\]
where

\[\frac{\partial {E}_{i}}{\partial {\tilde{F}}_{jk}^{i}}=\begin{cases}\frac{1}{{N}_{i}^{2}{M}_{i}^{2}}{[{({\tilde{F}}^{i})}^{\top }({\tilde{G}}^{i}-{G}^{i})]}_{kj} & \text{if }{\tilde{F}}_{jk}^{i} > 0\\ 0 & \text{if }{\tilde{F}}_{jk}^{i}\le 0\end{cases}\]
and \(\frac{\partial {\tilde{F}}^{i}}{\partial \tilde{{\boldsymbol{x}}}}\) is automatically handled by back-propagation in Caffe. The gradient \(\frac{\partial L}{\partial \tilde{{\boldsymbol{x}}}}\) is then fed into the Limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm with bound constraints of [0, 255] (L-BFGS-B)^{34}, or the Adam optimizer^{35}, on each encoded dimension to minimize the total loss L. At the end of the optimization, convergence to a local minimum is achieved and each pixel is assigned a 3-vector; in other words, an N × M × 3 matrix is obtained for the next step of decoding.
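The outer optimization can be sketched with SciPy's L-BFGS-B under [0, 255] bound constraints. In this toy, a fixed linear "feature" transform stands in for the network, so the loss and gradient are computed in closed form; in the actual method both come from back-propagation through the transferred layers:

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for the Gram-matching loss: match the Gram matrix of a
# linear "feature" transform F = W * x (elementwise broadcast).
rng = np.random.default_rng(2)
W = rng.random((4, 12))                    # fixed random "filters"
x_orig = rng.integers(0, 256, 12).astype(float)
F_orig = W * x_orig
G_target = F_orig @ F_orig.T               # Gram matrix of the "original"

def loss_and_grad(x):
    F = W * x                              # feature maps of the trial point
    D = F @ F.T - G_target                 # Gram-matrix mismatch
    loss = (D ** 2).sum()
    # d||FF^T - G||^2 / dF = 4 D F (D symmetric); chain rule through F = W*x.
    grad = 4.0 * ((D @ F) * W).sum(axis=0)
    return loss, grad

x0 = rng.random(12) * 255.0                # random initialization in [0, 255]
res = minimize(loss_and_grad, x0, jac=True, method="L-BFGS-B",
               bounds=[(0.0, 255.0)] * 12)
```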
Decoding Reconstruction via Unsupervised Learning and Simulated Annealing
The reconstructed microstructure in the encoded space obtained from the previous step is essentially an N × M × 3 matrix with each entry ranging from 0 to 255. To generate a microstructure image that is compatible with further numerical analysis, such as Finite Element simulations, it is critical to convert the N × M × 3 image back to an N × M image with each pixel appropriately labelled with a material phase. In other words, for each location in the microstructure, its current 3-vector representation needs to be replaced by a scalar label that indicates the material phase. Since the reconstruction in the encoded space obtained by L-BFGS-B or Adam optimization is a local minimum, there is no guarantee that the 3-vectors of the reconstructed pixels lie exactly at the coordinates sampled in the encoding step. Nevertheless, it is observed that the 3-vectors of pixels belonging to the same material phase are still clustered. Hence, we apply an unsupervised learning approach (K-means clustering) to separate the reconstructed pixels into K groups, where K is the number of material phases counted in the encoding process.
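A minimal sketch of this decoding step follows, using a small Lloyd's-algorithm K-means. Initializing the cluster centers at the phase coordinates from the encoding step is an assumption made here for convenience (it also keeps the cluster indices aligned with the phase labels); any standard K-means implementation would serve:

```python
import numpy as np

def kmeans_decode(enc, codes, n_iter=20):
    """Decode an N x M x 3 reconstruction back to phase labels by K-means,
    with centers initialized at the encoding-step phase codes (K x 3)."""
    N, M, _ = enc.shape
    X = enc.reshape(-1, 3)
    centers = codes.astype(float).copy()
    for _ in range(n_iter):                      # Lloyd's iterations
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                # assign to nearest center
        for k in range(len(centers)):            # update non-empty clusters
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels.reshape(N, M)

# Toy check: pixels scattered around two phase codes are recovered.
rng = np.random.default_rng(3)
codes = np.array([[0.0, 0.0, 0.0], [255.0, 255.0, 255.0]])
truth = rng.integers(0, 2, (8, 8))
enc = codes[truth] + rng.normal(0.0, 5.0, (8, 8, 3))  # noisy reconstruction
labels = kmeans_decode(enc, codes)
```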
It should be noted that K-means clustering does not enforce the partition ratio of pixels for each material phase, so it is possible that the volume fraction of each cluster differs slightly from that of the original microstructure. Considering that volume fraction (VF) is a key feature for material systems such as polymer composites or carbonates, the last step of the algorithm is to compensate for the discrepancy in VF between the original and the reconstructed microstructures by switching the phase labels of pixels on the phase boundaries. Herein we utilize a stochastic optimization approach, simulated annealing (SA), to match the phase VFs with those of the original microstructure.
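A sketch of this volume-fraction matching step for a two-phase system follows: randomly chosen boundary pixels are flipped under a Metropolis acceptance rule. The linear cooling schedule, initial temperature, and step count are illustrative choices, not values from the text:

```python
import numpy as np

def match_volume_fraction(labels, vf_target, n_steps=5000, T0=0.05, seed=4):
    """Nudge the phase-1 volume fraction toward vf_target by flipping the
    labels of randomly chosen boundary pixels, accepted with a simulated
    annealing (Metropolis) rule. Two-phase (0/1) microstructures only."""
    rng = np.random.default_rng(seed)
    lab = labels.copy()
    N, M = lab.shape
    for step in range(n_steps):
        T = T0 * (1.0 - step / n_steps) + 1e-9   # linear cooling schedule
        i, j = rng.integers(1, N - 1), rng.integers(1, M - 1)
        nbrs = (lab[i-1, j], lab[i+1, j], lab[i, j-1], lab[i, j+1])
        if all(n == lab[i, j] for n in nbrs):
            continue                              # interior pixel: skip
        err = abs(lab.mean() - vf_target)
        lab[i, j] = 1 - lab[i, j]                 # trial flip
        new_err = abs(lab.mean() - vf_target)
        if new_err > err and rng.random() >= np.exp((err - new_err) / T):
            lab[i, j] = 1 - lab[i, j]             # reject: undo the flip
    return lab

rng = np.random.default_rng(5)
lab0 = (rng.random((32, 32)) < 0.4).astype(int)   # VF approximately 0.4
lab1 = match_volume_fraction(lab0, vf_target=0.5)
```

Restricting the flips to boundary pixels, as in the text, preserves the local morphology while only the phase fractions are adjusted.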
Data Availability Statement
The source code of this work will be made available upon request to the corresponding author.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1.
Executive Office of the President, N. S. a. T. C. Materials Genome Initiative for Global Competitiveness. (2011).
 2.
Pilania, G., Wang, C., Jiang, X., Rajasekaran, S. & Ramprasad, R. Accelerating materials property predictions using machine learning. Sci. Rep. 3, 2810 (2013).
 3.
Olson, G. B. Computational design of hierarchically structured materials. Science 277, 1237–1242 (1997).
 4.
Wang, Y. et al. Identifying Interphase Properties in Polymer Nanocomposites using Adaptive Optimization. Compos. Sci. and Technol. (in press) (2018).
 5.
Liu, R. et al. A predictive machine learning approach for microstructure optimization and materials design. Sci. Rep. 5 (2015).
 6.
Mannodi-Kanakkithodi, A., Pilania, G., Huan, T. D., Lookman, T. & Ramprasad, R. Machine learning strategy for accelerated design of polymer dielectrics. Sci. Rep. 6 (2016).
 7.
Torquato, S. Random heterogeneous materials: microstructure and macroscopic properties. Vol. 16 (Springer Science & Business Media, 2013).
 8.
Xu, H., Dikin, D. A., Burkhart, C. & Chen, W. Descriptor-based methodology for statistical characterization and 3D reconstruction of microstructural materials. Comput. Mater. Sci. 85, 206–216 (2014).
 9.
Jiang, Z., Chen, W. & Burkhart, C. Efficient 3D porous microstructure reconstruction via Gaussian random field and hybrid optimization. J. Microsc. 252, 135–148 (2013).
 10.
Bostanabad, R., Bui, A. T., Xie, W., Apley, D. W. & Chen, W. Stochastic microstructure characterization and reconstruction via supervised learning. Acta Mater. 103, 89–102 (2016).
 11.
Liu, X. & Shapiro, V. Random heterogeneous materials via texture synthesis. Comput. Mater. Sci. 99, 177–189 (2015).
 12.
DeCost, B. L. & Holm, E. A. A computer vision approach for automated analysis and classification of microstructural image data. Comput. Mater. Sci. 110, 126–133 (2015).
 13.
Chowdhury, A., Kautz, E., Yener, B. & Lewis, D. Image driven machine learning methods for microstructure recognition. Comput. Mater. Sci. 123, 176–187 (2016).
 14.
Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
 15.
Chen, C. L. et al. Deep learning in label-free cell classification. Sci. Rep. 6 (2016).
 16.
Litjens, G. et al. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci. Rep. 6 (2016).
 17.
Pinaya, W. H. L. et al. Using deep belief network modelling to characterize differences in brain morphometry in schizophrenia. Sci. Rep. 6 (2016).
 18.
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks, Paper presented at Advances in Neural Information Processing Systems Conference, Lake Tahoe, NV, Proceedings of Advances in Neural Information Processing Systems 25, 1097–1105. (NIPS, 2012).
 19.
Socher, R., Huang, E. H., Pennington, J., Ng, A. Y. & Manning, C. D. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection, Paper presented at Advances in Neural Information Processing Systems Conference, Granada, Spain, Proceedings of Advances in Neural Information Processing Systems 24. 801–809 (NIPS, 2011).
 20.
Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).
 21.
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
 22.
Cang, R. et al. Microstructure Representation and Reconstruction of Heterogeneous Materials via Deep Belief Network for Computational Material Design. arXiv preprint arXiv:1612.07401 (2016).
 23.
Li, X. et al. A deep adversarial learning methodology for designing microstructural material systems. ASME 2018 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Design Automation Conference, Quebec City, Quebec, Canada (ASME,) (paper accepted 2018).
 24.
Glorot, X., Bordes, A. & Bengio, Y. Domain adaptation for large-scale sentiment classification: A deep learning approach, Paper presented at the 28th international conference on machine learning, Bellevue, WA, Proceedings of the 28th international conference on machine learning (ICML-11). 513–520 (ICML, 2011).
 25.
Goodfellow, I., Mirza, M., Courville, A. & Bengio, Y. Multi-prediction Deep Boltzmann Machines, paper presented at Advances in Neural Information Processing Systems Conference, Lake Tahoe, NV, Proceedings of Advances in Neural Information Processing Systems 26, 548–556 (NIPS, 2013).
 26.
Bengio, Y. Deep learning of representations for unsupervised and transfer learning. Paper presented at the 28th international conference on machine learning Unsupervised and Transfer Learning workshop, Bellevue, WA. Proceedings of ICML Unsupervised and Transfer Learning workshop, 17–36 (ICML, 2012).
 27.
Torrey, L. & Shavlik, J. Handbook of Research on Machine Learning Applications (IGI Global, 2009).
 28.
DeCost, B. L., Francis, T. & Holm, E. A. Exploring the microstructure manifold: image texture representations applied to ultrahigh carbon steel microstructures. Acta Mater. 133, 30–40 (2017).
 29.
Lubbers, N., Lookman, T. & Barros, K. Inferring low-dimensional microstructure representations using convolutional neural networks. Physical Review E 96, 052111 (2017).
 30.
Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
 31.
Deng, J. et al. ImageNet: A large-scale hierarchical image database, Paper presented at Computer Vision and Pattern Recognition, Miami Beach, FL. Computer Vision and Pattern Recognition, 248–255 (IEEE, 2009).
 32.
Gatys, L., Ecker, A. S. & Bethge, M. Texture synthesis using convolutional neural networks, paper presented at Advances in Neural Information Processing Systems Conference, Montreal, Canada. Proceedings of Advances in Neural Information Processing Systems, 262–270 (NIPS, 2015).
 33.
Chan, T. M. & HarPeled, S. Approximation algorithms for maximum independent set of pseudodisks. Discrete & Computational Geometry 48, 373–392 (2012).
 34.
Byrd, R. H., Lu, P., Nocedal, J. & Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16, 1190–1208 (1995).
 35.
Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
 36.
Yosinski, J., Clune, J., Bengio, Y. & Lipson, H. How transferable are features in deep neural networks? Paper presented at Advances in neural information processing systems Conference, Montreal, Canada, Proceedings of Advances in neural information processing systems, 3320–3328. (NIPS, 2014).
 37.
Torquato, S. Necessary conditions on realizable twopoint correlation functions of random media. Ind. Eng. Chem. Res. 45, 6923–6928 (2006).
 38.
Lu, B. & Torquato, S. Lineal-path function for random heterogeneous materials. Phys. Rev. A 45, 922 (1992).
 39.
Borbely, A., Csikor, F. F., Zabler, S., Cloetens, P. & Biermann, H. Three-dimensional characterization of the microstructure of a metal–matrix composite by holotomography. Mater. Sci. Eng. A 367, 40–50 (2004).
 40.
Grigoriu, M. Random field models for two-phase microstructures. J. Appl. Phys. 94, 3762–3770 (2003).
 41.
Liu, Y., Greene, M. S., Chen, W., Dikin, D. A. & Liu, W. K. Computational microstructure characterization and reconstruction for stochastic multiscale material design. Comput. Aided. Des. 45, 65–76 (2013).
 42.
Hamamoto, Y. et al. A Gabor filter-based method for recognizing handwritten numerals. Pattern Recognition 31, 395–400 (1998).
 43.
Yu, S. et al. Characterization and Design of Functional Quasi-Random Nanostructured Materials Using Spectral Density Function. Journal of Mechanical Design 139(7), 071401 (2017).
 44.
Li, L. New formulation of the Fourier modal method for crossed surface-relief gratings. JOSA A 14, 2758–2767 (1997).
 45.
Moharam, M., Pommet, D. A., Grann, E. B. & Gaylord, T. Stable implementation of the rigorous coupled-wave analysis for surface-relief gratings: enhanced transmittance matrix approach. JOSA A 12, 1077–1086 (1995).
 46.
Ji, S., Xu, W., Yang, M. & Yu, K. 3D convolutional neural networks for human action recognition. IEEE transactions on pattern analysis and machine intelligence 35, 221–231 (2013).
 47.
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition, paper presented at the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770–778. (IEEE, 2016).
 48.
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
 49.
McKay, M. D., Beckman, R. J. & Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 42, 55–61 (2000).
 50.
Szegedy, C. et al. Going deeper with convolutions, arXiv preprint arXiv:1409.4842 (2015).
Acknowledgements
This work is supported by the Center for Hierarchical Materials Design (CHiMaD, NIST 70NANB14H012), Design of Engineering Material Systems (DEMS, CMMI-1334929 (NU), CMMI-1333977 (RPI)) and the Goodyear Tire and Rubber Company.
Author information
Contributions
W. Chen, X. Li, Y. Zhang and C. Burkhart formulated the research problem by identifying the primary challenges and applications. X. Li and Y. Zhang then developed the idea of the present work and proposed the workflow. Code implementation was done by X. Li with the help of H. Zhao and Y. Zhang on machine configuration, Linux setup and sample collection. The design of the numerical experiments was developed through continuous discussion between X. Li, Y. Zhang, C. Burkhart, W. Chen and L. C. Brinson. X. Li drafted the manuscript and continuously improved the writing with the help of Y. Zhang, H. Zhao, C. Burkhart, W. Chen and L. C. Brinson. All authors reviewed the manuscript and contributed to the revision.
Competing Interests
The authors declare no competing interests.
Corresponding author
Correspondence to Wei Chen.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.