The discovery of processes for the synthesis of new materials involves many decisions about process design, operation, and material properties. Experimentation is crucial but as complexity increases, exploration of variables can become impractical using traditional combinatorial approaches. We describe an iterative method which uses machine learning to optimise process development, incorporating multiple qualitative and quantitative objectives. We demonstrate the method with a novel fluid processing platform for synthesis of short polymer fibers, and show how the synthesis process can be efficiently directed to achieve material and process objectives.
Experimentation is central to the discovery and development of processes for the synthesis of novel materials. This involves a myriad of decisions for the system configuration, operating parameters and material properties, described by multiple variables. As the multi-variable complexity of new processes and materials increases, the task of identifying the best combination of process settings to achieve output material property targets becomes complex and time consuming. Design of Experiment (DoE) methodologies1,2,3 (including Response Surface Analysis) have for decades provided experimentalists with a framework to systematically explore multi-variable space, however the experimental costs can become burdensome as complexity increases.
Challenges posed by materials processing optimisation are manifold. Material and process variables must be combined to search for optimal target characteristics. This is complex since the variables relating to material properties and processes are often non-homogeneous or non-orthogonal. Further, multiple diverse objectives need to be incorporated; these could be properties evaluated through quantitative measures, or otherwise be specified qualitatively due to need or difficulty of attaining quantitative values - for example morphological homogeneity or presence of by-products. Cost, performance, and quality metrics are typical for industrial systems. Finally, while existing databases of materials and properties may provide varying degrees of guidance for synthesis of novel materials, experimenters in the early stages of process development and optimisation rely more upon DoE approaches, hypothesised governing principles, broad approximations and expert intuition.
In chemistry and materials science, computational methods such as Density Functional Theory (DFT) can be used to predict interactions at the atomic scale. In combination with such methods, machine learning models have been used to predict material properties, model inter-atomic potentials, predict crystal and lattice structures, and understand property-structure relationships in amorphous materials4. Using an iterative design process, different material compositions have been formulated and evaluated computationally in order to optimise particular physical properties, such as the elastic modulus of M2AX compounds5, or thermal conductance across nanostructures6. Complex systems such as particle-reinforced composites have recently been approached through multistep modelling7 which demonstrates efficient tuning of materials properties and composition through micro-mechanical and multiscale models.
Computational evaluations are usually easier to perform than physical experiments, allowing many more iterations to be done in the search process. In the absence of detailed knowledge of a material synthesis process, and in the absence of sensitive predictive tools like DFT, finite element analysis, and computational fluid-dynamics, the optimisation of a physical process is best approached as a black-box problem, to reduce the impact of possibly-incorrect or inaccurate assumptions. When materials must be physically synthesised, efficient global optimisers are preferred. These have been recently used to design new shape memory alloys from a large search space of compositions8, and to design experiments for the production of Bose-Einstein Condensates9.
The computational challenge in optimisation of novel processes arises because the mathematical relationship between the control variables and the target is often unknown - it is a classic “Black-Box Function”. Experimental data is expensive to acquire, so achieving an optimal setting with a minimum of experiments is desirable. Bayesian optimisation10,11,12 is a machine learning framework to optimise expensive black-box functions, which it models through a stochastic Gaussian process (GP)13. The GP may be used to mimic an experimenter, building a mathematical model from available experimental data, then using its model it derives an “educated guess” to recommend the next experimental setting. The GP is updated as experiments proceed - the “educated guess” becomes increasingly accurate, and its epistemic uncertainty reduces. In contrast, there could be other sources of uncertainty, such as those stemming from uncertain inputs or from a stochastic system. Such uncertainties are aleatoric in nature and it is important to perform sensitivity analysis14 of the result. Bayesian optimisation is theoretically shown to have the best achievable order of convergence rate in terms of number of samples (sub-linear growth in cumulative regret) for a global optimisation algorithm15 and it allows integration of different output types such as numerical, pairwise quality measure and even their hybrid. It also offers high dimensional optimisation16, optimisation with unknown constraints17, and even transfer learning from past experiments18 in an efficient optimisation framework. While Bayesian optimisation offers a powerful construct for adaptive experimentation5, limited work exists8, 9 and does not tackle the key challenges of incorporating multiple objectives, or combining heterogeneous objectives.
In this contribution, we demonstrate that machine learning techniques can be used to effectively optimise developmental stage processes for synthesis of novel materials. We formulate a method based on Bayesian optimisation to help navigate experimental complexity by integrating an iterative experimental approach with data-driven models. Using both material and process-related variables as inputs, this Adaptive Experimental Optimisation (AEO) framework recommends the next experimental setting, followed by performance of the experiment and return of the data values to the algorithm in an iterative cycle until the target product goal is achieved. Multiple objectives that can be both quantitative and qualitative are accounted for in the framework.
We test the algorithm using a modular fluid processing platform to produce short (and ultrafine) polymer fibers19, 20, a novel physical process which has not previously been modelled or fully characterised. Short polymer fibers (SPF) are filamental micro structures with typical dimension magnitudes of 1–10 μm diameter and 10–1000 μm in length. SPF are produced by coagulating a polymer solution in a shear flow21. The fiber diameter and length are adjustable by varying the synthetic conditions, but little is known about the inter-parameter relationships in the system. In the new method described in this paper, a fiber is formed in a fluidic device by injecting a polymer solution coaxially into a rapidly flowing coagulant liquid (Fig. 1). The coagulant flows through a channel of defined width (h) and the polymer solution is injected at the centre line of the channel. The flow is then funnelled into a 1 mm wide output channel through a constriction of variable angle (α). The distance between polymer injection and the start of the constriction (d) has a direct effect on the residence time of the polymer filament in the chamber before it is accelerated, and possibly elongated, within the constriction. The many degrees of freedom of this system, the dimensions (micron and submicron), the varied morphology of the products (from fibres, through to spheres and debris), and the complexity of the flow make modelling this system very challenging and of low accuracy. These features make this platform suitable for optimisation through AEO.
The main contributions of this paper are:
Formulation of Adaptive Experimental Optimisation (AEO) incorporating both material properties and process characteristics within a coherent experimental optimisation framework;
Incorporation of multiple objectives for experimental optimisation, including both qualitative and quantitative targets;
Systematic methodology to objectively draw out insights about qualitative targets;
Description of a novel process for synthesis of short polymer fibers (SPF);
Validation of the AEO for optimisation of SPF materials and associated process.
Short Polymer Fiber Synthesis
The fluidic devices were constructed with various combinations of the three geometric variables shown in Fig. 1, namely channel width (h = 3, 6 or 9 mm), injection position (d = 0, 15 or 30 mm) and constriction angle (α = 10° or 25°), giving a total of 18 different configurations.
The coagulant liquid used was 1-butanol (>99%, Chem Supply ) maintained at a temperature of 4–10 °C and driven by a gear pump (Micropump GB-P23.KF5SA) connected to a variable power supply (TTiEX355) at a maximum current of 1 Amp. The voltage of the power supply was adjusted to control the coagulant flow rate. Poly(ethylene-co-acrylic acid) pellets (Primacor 5990I DOW ) were added to an aqueous solution of de-ionised water and ammonium hydroxide under reflux at 110 °C, to prepare a 16.5% w/v dispersion (polymer dope). This polymer dope was pumped into the device using 5 ml Terumo syringes mounted on a syringe pump (Legato 270). For each experiment the device was initially flushed with 50–100 ml of water through the coagulant inlet, then the polymer flow was initiated and once the polymer reached the nozzle (judged through observation of the flow through transparent tubing) the coagulant flow was started. After reaching a stable flow (typically 10 s), approximately 30 ml of produced suspension was collected in a glass beaker. Both polymer and coagulant flows were then stopped and the device flushed with water to avoid coagulation in the channels. The collected suspension was diluted in ethanol and a few drops were deposited on a glass slide and dried in air. Three values were explored for polymer flow rate (f = 80, 110, or 140 ml/h) and coagulant speed (v = 43, 68, 93 cm/s). In combination with geometric variables there are thus 162 possible experiments.
Microscopy images of the dry samples were taken at different magnifications using an optical microscope (Olympus BX51 with a DP71 camera) calibrated using a TEM 462 nm grid and an optical 10 μm calibration slide. ImageJ (Fiji distribution) was used to analyse the images for measurement of fiber dimensions. Fiber length was measured using the multi-line tool at 5x and 20x magnifications, while the single line tool was used for measurements of diameter at 100x magnification. High magnification and low field of view prevented the simultaneous measurement of length and diameter for each fibre, so separate length and diameter distributions were used. The median values of the measured length and diameter were extracted for each image. For length and diameter, a minimum of 50 or as many measurements as possible (if fewer fibers found) were taken per image, and a few images were processed per sample. The qualitative assessment used for the pairwise comparison ranking (see “Qualitative Score - f Q (·) and interface for qualitative comparison”) was performed with the objective to emphasise fiber suspensions with less by-product (debris and spheres) and less entangled fibers.
Adaptive Experimental Optimisation
The role of AEO is to guide the experimenter by suggesting the next experimental setting based on Bayesian optimisation with the objective to reach the target output properties. Targets are specified and the experimentally available range of both process and product parameters are stipulated. In our case, the product parameters (y) are median length, median diameter and quality of fibers; and the process parameters (x) are position, constriction angle, channel width, polymer flow and solvent flow.
Figure 2 describes the overall iterative optimisation process. Step 1 models an unknown function with a Gaussian process using the data acquired thus far. Step 2 recommends the next experimental setting (x t ) by optimising an acquisition function derived from the Gaussian process. In Step 3 the experimentalist performs the suggested experiments and measures the output characteristics, as described in the Methods section. Step 4 uses these measurements to compute the combined target utility y t (Equation (10)). This new observation pair (x t , y t ) is appended to the data set. Step 5 checks whether the product meets the target specifications. Steps 1–5 are repeated in an iterative loop until the target is achieved or process is terminated. Steps 1, 2 and 4 are described in the following sections.
Step 1: Function modelling
Bayesian optimisation is a powerful tool to optimise an expensive black-box function and we use it to recommend the next experiment setting based on the observations , where are the objective values computed via experimentation by evaluating the results. In our case a multi-objective function combines distance to target length and diameter and the qualitative score via Equation (10). A common method to model an unknown function f is using a Gaussian process as a prior, that is, f(·)~GP(0, K(·,·)). The prior probability of function values is a multivariate Gaussian where
and K is the t × t covariance matrix whose ij-th element is k(x i , x j ). The k is a kernel function. The kernel function computes the distance between data points. A common kernel function is the squared exponential (SE) function, which is defined as
where the kernel length-scale θ affects the smoothness of the objective function - a larger value means a smoother function.
When a new observation point x t+1 is made, the predictive distribution of y t+1 can be computed as
with and , where .
Step 2: Recommend next experimental setting
The function f is expensive to sample due to the potential cost of experimental measurements, hence an alternate strategy is to build a surrogate function that is cheaper to compute and finds its maximum. These acquisition functions trade-off exploitation of the high predicted mean with exploration using high predicted variance. We use the expected improvement (EI) as the acquisition function, which defines the improvement over the current best value and is given as ref. 12
where . and are the CDF and PDF of standard normal distribution.
To maximise the acquisition function, we use DIRECT22, a deterministic and derivative-free optimiser. The resulting maximum is the recommendation for the next experimental setting.
Step 4: Deriving the combined target score y t
Consider x t as the vector of experimental parameters (solvent flow rate, polymer flow rate, device angle, device position and channel width) at iteration t. We formulate quantitative scores, and as the difference of the measured fiber median length L t and fiber median diameter D t to their respective targets L T and D T as:
The maximum values and are chosen as target bounds beyond which the fibers are not useful. In our case, , and .
Qualitative Score - f Q (·) and interface for qualitative comparison
The qualitative score captures aspects that are difficult to measure quantitatively for example, uniformity and nature of the precipitate’s morphology, amount of by-product, homogeneity etc. It is difficult to rank each fiber sample image on a pre-determined and uniformly-spaced scale which may cover all types and qualities of samples, since at any point of the experiment the iteratively-collected samples are not representative of the whole spectrum of possible samples and the scaling might change as a result and be prone to high subjective bias. Possible solutions include the preparation of a representative sample for each unit of the scale, but this would void the efforts for increased efficiency.
An alternative strategy is to use a visual interface to compare pairs of images, scoring them as either better, worse or difficult to tell. Since this categorisation is broad, subjective bias is reduced. Although it would be useful, it is not practical to rank the current image against all previous images. At any iteration t, the method asks the user to compare the images of the fibers produced in the current iteration ( produced by parameter configuration ) with a set of images randomly chosen from previous iterations. If image generated from configuration is preferred over image from configuration , we get . If is preferred over , we get . If it is difficult to tell, both relationships are maintained: , and which ensures that the qualitative scores returned by Equation (9) are close.
This process gives rise to a set of observed pairwise preferences at iteration t that we denote as the preference set :
We formulate a method to extract qualitative scores from such pairwise comparisons. The number of comparisons scales logarithmically with iteration number t as . The interface for image comparison is shown in Fig. 3. The user can indicate which image is better or if it is difficult to tell. The qualitative assessment of the samples favours high fiber yield, low entanglement (homogenous dispersion) and absence of by-products such as debris and spheres. In this example, the “A” sample on the left is preferred because it is more homogeneous, even though some debris is present. The “B” sample is cleaner, but is not homogeneous, and contains some very large fibers.
From this data, we learn the latent score function f Q at each iteration.
Computation of f Q
Given a preference set defined in Equation (7), we use a Gaussian process to learn the latent relation that explains this partial ranking across the images at each iteration. As the set of images differ between iterations, the random choice at each iteration will produce different images to compare. The function explaining the latent ranking relationship will keep improving but will change from one iteration to the next.
The unobserved latent function is associated with each sample x i . We impose a Gaussian process prior on these latent values and derive the latent values based on an approximate likelihood function. We assume that the latent function is contaminated with Gaussian noise .
Suppose that the preference set contains M preference pairs and denote the mth preference pair . The likelihood function is defined as ref. 23
where and .
We want to maximise the posterior distribution of the latent function, or
where is the prior probability and is a multi-variant Gaussian function with the covariance function . Computing the maximum a posterior (MAP) estimate of the latent function is equivalent to minimising the following function
We can employ the Newton-Raphson approach to find the solution for simple cases. By setting , we get the MAP estimate of f LQ as
where . Further details of this approach are described in the literature23. To avoid over-stretch or over-squeeze on quality scores, we use the following normalisation approach
where and are the minimal and maximal value of respectively; and are the normalised quality scores.
Combining quantitative and quality scores
We combine the multiple objectives into a single objective score. Since and are difference to the targets, the values need to be minimised while needs to be maximised because higher values are indicative of better quality. The combined utility score is defined as proportional linear combination. In the absence of any prior knowledge, we assign equal weight (1/3) to length, diameter and quality.
The score has range [0, 1] and is 0 when all objectives are met. Our optimiser seeks to minimise this score, which increases with the deviation of product fibres from the target. This score will not decrease monotonically because the objectives may get better or worse individually across iterations. To track the objective function progress towards targets we define the (BFV) to be the lowest value of found until iteration .
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Our experiments aim to (1) find optimal conditions for manufacturing different fibres, and (2) validate the Adaptive Experimental Optimisation (AEO) method. Fiber production experiments were conducted iteratively, varying the five input variables (solvent flow rate, polymer flow rate, device angle, device position and channel width), until the product dimensions converged to within 20% of the target length and diameter. Initial trials found this to occur within about 20 iterations, but this number will vary with process and will rise as tolerance decreases.
We performed a series of experimental runs, each consisting of a total of 20 iterations guided by AEO, starting from five initial chosen synthetic conditions, which were either randomly-chosen or known-outcome conditions (random, bad, good). This process was repeated for three different fiber targets: (1) median length 70 μm and median diameter 1 μm, (2) length 40 μm and diameter 0.6 μm, and (3) length 50 μm and diameter 0.4 μm. For Target 1 we performed three runs with random starting states, and one run each from known “good” and “bad” starting states. For Targets 2 and 3 we performed two runs from random starting states. Table 1 summarises the results of optimisation experiments for three target fibers. The table lists the best found value (BFV), the iteration at which it was found, and the difference to target as a percentage of target length (L%) and diameter (D%). Most runs start from 5 randomly chosen states. We present here three typical results: target achieved from a random starting state, target achieved from a “bad” starting state, and target not achieved.
Figure 4 shows results for Target 1, a fiber of median length 70 μm and median diameter 1 μm. The BFV is shown in part (a). Parts (b) and (c) show the corresponding difference to target length and diameter as a percentage of the target. Iteration steps without circles have values above the maximum scale for the axis. The solid line indicates BFV, and the images at samples where the BFV is improved are shown in (d). The lowest BFV is achieved at iteration 14, with a 4.8% distance from target in length (b) and 0.5% distance in diameter (c). Note that the objective function (shown by circles in (a)) does not decrease monotonically, since at each iteration one criterion may improve while others may worsen.
Further analysis for Target 1 (L = 70 μm, D = 1 μm) is considered by varying the starting conditions. Instead of starting from randomly chosen conditions, five states that are known (from preliminary “seed” experiments) to lead to a product which is very different from the specified target (Supplementary Fig. S.1) are used. The target set were reached within 12 iterations.
For Target 1 (L = 70 μm, D = 1 μm) runs labelled “Good” and “Bad” start from the five best and worst samples achieved in the previous three runs. Note that the “Bad” run finds the same solution as Run 1, which is within 0.5% for target length, and 4.8% for target diameter. The BFV for this result is lower than achieved for Run 1 (0.161 < 0.221) suggesting that a “bad” start may cause a subsequent increase in the quality ratings.
Table 2 summarises the differences between targets. Target 1 (L = 70 μm, D = 1 μm) and Target 2 (L = 40 μm, D = 0.6 μm) show similar values for mean BFV and iteration. However, the average L% is significantly higher for Target 2, with fewer experiments showing a low value for L%, indicating that the process produces a large portion of fibers of length above 100 μm. Target 3 (L = 50 μm, D = 0.4 μm) shows a higher (worse) BFV of 0.32, and none of the tested conditions achieved the target diameter of 0.4 μm. Both runs find the same solution (L% = 2.7, D% = 104) within 4 iterations, but no subsequent improvement was obtained in the next 16 iterations. It should be noted that this finding is to be expected in some scenarios because defining a target does not imply that the system is capable of achieving it.
The AEO process described has allowed fast optimisation of a complex and multi-variable “black-box” system, where short polymer fibers of specified target diameter, length and quality could be produced using bespoke devices based on a small pool of preliminary data. The necessity to combine heterogeneous criteria is very common in materials synthesis process optimisation, as are high-degrees of complexity and low predictability of systems studied. The SPF production process here described involves a complex interplay of forces to result in the product, and a variety of different material properties can be produced as a result. The SPF process is dominated by local fluid-dynamics and the thermodynamics of polymer dope solidification. As the filament is flowing through the dispersant, it undergoes both elongation and coagulation. The forming short fibres therefore exhibit continuously-varying viscosity (which also displays a radial gradient within the filament) and experience a continuously-varying flow field (external forces). Modifying control parameters such as flow rates, therefore, has a double effect: on the fluid-dynamics and on the viscosity of the filament as function of position and time. Additional complications in attempting to build an accurate micro-mechanical model include: the low-level accuracy in the local flow-field knowledge, the non-constant-flow output by the pumps (which is pulsating), and the effects of device roughness. This system is, therefore, best optimised through output analysis, but the large matrix of conditions cause this to be a very time-consuming challenge. Its optimisation through iterative processes guided by AEO was here demonstrated possible. We set about controlling not just the quality and uniformity of the produced sample morphology, but also the sample’s median dimensions, in a system which could suffer from non-orthogonality in the process parameter-outcome relationship against the different criteria. The process has allowed us to combine different types of information: quantitative (scalar) and qualitative (categorical, non-uniform) and to optimise the combination of the three.
The AEO process will never self terminate as the iterative process loop will continually search for optimal states. Thus we can never say with certainty that a particular experimental setup is not capable of reaching a target. However, tracking the BFV can provide useful insight. If BFV does not decrease further with increasing iterations, it is likely but not certain the experimental system will not meet the chosen target. Supplementary Figure S.2 shows the results for an unachievable target: L = 50 μm, D = 0.4 μm. From iterations 2–20, the BFV does not decrease. Though the length target was found to be achievable, the deviation from the diameter target remained poor (>100% from target). In a real setting, it is likely that an experimenter will approach the results in an informed manner, and will be able to apply judgement to a non-improving BFV, where also taking into account cost of experimentation vs likelihood of results.
In order to design and operate the process it is important to know which variables have the strongest influence on performance. To estimate this we compared each state with the length and diameter of the resulting fibers, performing second order polynomial fit using samples over all 9 experiments, and noting the value of the resulting correlation coefficients R. The correlations between process parameters and the Length and Diameter is contained in Supplementary Table S.1. The angle and position have very little influence on the characteristics of the produced fibers. Polymer flow has a moderate influence (0.37) on length, but only a weak influence on diameter. Thus the most significant influence on overall performance is solvent speed followed by channel width. Sensitivity analysis shows that solvent speed also accounts for the majority (34%) of the uncertainty. See Supplementary Table S.2 for more details. While it is to be noted that these results are only valid within the set of tested experiment states, they provide an initial useful set of indications on the system as a whole.
While offering significant advantages for the optimisation of material properties and process configuration for SPF examples, there remains significant scope to expand the capability of experimental optimisation through AEO to other experimental systems. New methods must be found to increase the number of input variables and output target variables. Current algorithms finding systems with more than just 10 control variables is a challenge24, 25. The calculation of Expected Hypervolume Improvement, a key step in multi-objective optimisation quickly becomes computationally unfeasible as the number of objectives rises26. These problems present open computational challenges.
We have formulated and demonstrated Adaptive Experimental Optimisation (AEO) incorporating both material properties and process characteristics within a coherent experimental optimisation framework. Heterogeneous and multiple objectives can be successfully combined for experimental optimisation with the method. We show how to incorporate both qualitative and quantitative targets in an unified experimental optimisation setting. An efficient methodology to systematically elicit qualitative information about the experimental merit is presented. Through the AEO methodology, we have demonstrated how a developmental experimental system such as SPF synthesis can be efficiently directed to achieve material and process objectives. We show the potential for using the AEO methodology as a new kind of intelligent agent where the experimenter and the agent explore the new space together providing a mechanism for yielding greater insight into the relationships between critical variables, and a path to more efficient optimisation of novel materials and processes.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This research was supported under Australian Research Council’s Industrial Transformation Research Hub funding scheme (project number IH140100018). The present work was carried out with the support of the Deakin Advanced Characterisation Facility. The authors are thankful to Mr Keiran Pringle for assisting with device manufacture.
Electronic supplementary material
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.