Advanced machine learning decision policies for diameter control of carbon nanotubes

The diameters of single-walled carbon nanotubes (SWCNTs) are directly related to their electronic properties, making diameter control highly desirable for a number of applications. Here we used a machine learning planner based on the Expected Improvement decision policy that mapped the regions where growth was feasible vs. not feasible and then optimized synthesis conditions to selectively grow SWCNTs within a narrow diameter range. We maximized two ranges corresponding to Raman radial breathing mode frequencies around 265 and 225 cm−1 (SWCNT diameters around 0.91 and 1.06 nm, respectively), and our planner found optimal synthesis conditions within a hundred experiments. Extensive post-growth characterization showed high selectivity in the optimized growth experiments compared to the unoptimized growth experiments. Remarkably, our planner revealed significantly different synthesis conditions for maximizing the two diameter ranges in spite of their relative closeness. Our study shows the promise of machine learning-driven diameter optimization and paves the way towards chirality-controlled SWCNT growth.


INTRODUCTION
The utilization of single-walled carbon nanotubes (SWCNTs) in a wide range of applications, including "Beyond Moore's Law" computing 1 , hinges on our ability to control their physical properties 2 . A key property is the tube diameter, which is intimately related to its electronic properties 3 , making diameter control of SWCNTs important for semiconducting device applications 4 . In addition, control over SWCNT diameters is desirable for applications such as molecular transport 5 , membranes 6 , field emission 7 , and sensing 8 . To date, there are many published reports on diameter-controlled growth of SWCNTs by chemical vapor deposition (CVD), the most popular synthesis method. As the SWCNT diameter is defined primarily by the size of the catalyst particle 9,10 , most of these reports focus on synthesis from catalysts with defined sizes or compositions [11][12][13][14][15][16] . Other reported alternatives for controlling SWCNT diameters involve careful selection of the growth substrates 17,18 and manipulation of the type and amount of hydrocarbon feedstocks [19][20][21][22] .
A major drawback of trying to control particle shape and size during CVD growth is the inherent difficulty in mitigating several competing effects that occur at high growth temperatures (typically 600-900°C). These effects include diffusion of particles on and into the substrate, and particle coarsening [23][24][25] , which result in the broadening of the particle size distributions and, consequently, SWCNT diameters. One way to reduce these effects is to lower the growth temperature or develop schemes to immobilize the particles, and such methods have had limited success, although often at the cost of a decrease in yield 26 . On the other hand, manipulation of the hydrocarbon feedstock can lead to contrary results, owing to the wide range of feedstocks used for growth 27 . For example, higher amounts of hydrocarbon precursors such as ethane, ethylene, and methane result in larger tube diameters 19,21,22 , whereas others such as CO result in smaller diameters 22 . These effects are further complicated with the inclusion of additives such as NH 3 , H 2 O, or CO 2 , which can increase or decrease the average diameter of the SWCNTs through etching or selective catalyst particle deactivation [28][29][30] .
The precursors and additives used in CVD can be classified as catalyst reducing (e.g., hydrocarbons and CO) or catalyst oxidizing (H 2 O and CO 2 ). The conflicting results published in the literature suggest that the SWCNT diameter could be tuned by balancing the ratios of reductants and oxidants, in addition to altering the synthesis temperature. Unfortunately, the few systematic studies reported so far have been restricted to the influence of only one or two growth parameters 21,31 . Here we considered three synthesis parameters, namely temperature and the amounts (partial pressures) of reductants (ethylene, acetylene, and hydrogen) and oxidants (CO 2 and system water vapor) to tune the growth of SWCNTs with diameters in a narrow range. We achieved this by employing a machine learning (ML) planner to optimize diameter-selective growth of SWCNTs from a 1 nm-thick cobalt catalyst film.
ML algorithms are increasingly being utilized in materials research and can optimize a variety of processes [32][33][34][35][36][37][38] . We have also previously used ML (random forest and Bayesian optimization) to optimize the average growth rate and the yield of SWCNTs using in situ Raman spectroscopy as the feedback method 39,40 .
Here we focused on optimization of the selective growth of small-diameter carbon nanotubes (CNTs) through their diameter-dependent low-frequency Raman radial breathing modes (RBMs). An ML planner based on the Expected Improvement (EI) decision policy was used to map the growth parameter phase spaces for maximizing the intensities of two RBMs at 225 and 265 cm−1, which correspond to SWCNT diameters around 1.06 and 0.91 nm, respectively. Analysis of the RBM distributions with four laser excitation wavelengths showed that roughly a third of the SWCNTs in the optimized experiments lie in the desired diameter range. Notably, the growth parameter space for maximizing the intensities of these two RBMs was significantly different in spite of their closeness in diameters. Our results show that it is indeed possible to use ML to control SWCNT diameter and point the way towards scaled and selective synthesis of diameter-controlled CNTs.

RESULTS AND DISCUSSION

SWCNT growth
The SWCNT growth experiments were performed in ARES TM (Autonomous Research System), an automated high-throughput laser-induced cold-wall CVD system coupled to a Raman spectrometer (experimental details in the "Methods" section) 39 . In the past, ARES has provided insights into chiral angle-dependent growth rates and defects in SWCNTs 41,42 , optimal phase spaces for CNT growth [43][44][45][46] , new alloy catalyst compositions for SWCNT growth 47 , and into defect evolution in two-dimensional materials [48][49][50][51] . Here we grew SWCNTs from a 1 nm Co catalyst film deposited using ion beam sputtering onto 10 nm Al 2 O 3 -coated Si micropillars. We used ethylene and acetylene as the hydrocarbon source. It is noteworthy that both ethylene and acetylene can reduce the catalyst film, and are therefore considered reductant species 52 . Along with the hydrocarbons, we used hydrogen as an additional reductant and CO 2 as the oxidant. We also measured the base water vapor pressure in the system, which was <1 p.p.m. The growth temperatures and pressures ranged between 500 and 900°C, and between 2 and 50 Torr, respectively. Under these conditions, we typically grow SWCNTs in the form of bundled mats 41,[43][44][45] . Supplementary Fig. 1 shows a representative scanning electron microscope image of the SWCNTs. Figure 1 shows a Raman spectrum in the low-frequency RBM range, collected from one of our typical growth experiments, which was not optimized for SWCNT diameter selectivity. This spectrum was collected post growth in ARES, with an excitation wavelength of 532 nm. At least four distinct RBMs at 162, 189, 225, and 265 cm −1 can be identified in the spectrum. The Raman spectrum is plotted beneath the so-called Kataura plot 53 , which maps the electronic transition energies of each SWCNT as a function of its diameter (top horizontal axis in Fig. 
1, calculated according to the relation $\omega_{\mathrm{RBM}} = 235.9/d_t + 5.5$ 54 , where $d_t$ (in nm) and $\omega_{\mathrm{RBM}}$ (in cm−1) are the diameter and RBM frequency, respectively). The Raman intensity of a SWCNT is enhanced when it is in resonance, i.e., when the excitation laser energy (2.33 eV in our ARES experiments, denoted by the solid horizontal line in the top panel of Fig. 1) is close to one of the electronic transition energies of the SWCNT 55 . The resonance-enhanced intensity occurs over an energy window of ±0.1 eV, indicated by the two dashed horizontal lines in Fig. 1. Thus, all nanotubes within the window can appear in the spectrum.
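The RBM-to-diameter relation quoted above can be inverted directly. The following is a minimal sketch (the function names are illustrative and not part of the original work) that converts between RBM frequency and tube diameter using only the constants 235.9 and 5.5 given in the text.

```python
def rbm_to_diameter(omega_rbm_cm1: float) -> float:
    """Invert omega_RBM = 235.9 / d_t + 5.5 to get the diameter d_t in nm."""
    return 235.9 / (omega_rbm_cm1 - 5.5)


def diameter_to_rbm(d_t_nm: float) -> float:
    """Evaluate omega_RBM = 235.9 / d_t + 5.5 for a diameter d_t in nm."""
    return 235.9 / d_t_nm + 5.5


if __name__ == "__main__":
    # The two RBM frequencies targeted in this work.
    for omega in (225.0, 265.0):
        print(f"{omega:.0f} cm^-1 -> {rbm_to_diameter(omega):.2f} nm")
```

With this relation, 265 cm−1 maps to roughly 0.91 nm and 225 cm−1 to roughly 1.07 nm, close to the approximate diameters quoted in the text.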

Support vector machine-constrained EI policy
The campaign's objective was to identify optimal CNT synthesis conditions that achieve two scientific goals, namely maximizing the number of SWCNTs of a particular diameter and selectively minimizing the synthesis of different-diameter SWCNTs. To achieve this, we framed the experimental parameter search as an optimization of a scalar objective function (OF) that considers the two goals described above and used a constrained version of the EI policy to select experiments. The policy was additionally constrained by a feasible set of synthesis conditions dynamically learned using a support vector machine (SVM) classifier in an attempt to explicitly avoid low OF values, which is an augmentation of the natural balance that EI achieves.
The desired outcome of our experiments was diameter selectivity and optimal yield. To measure how well this was achieved for SWCNTs synthesized with growth conditions x, we considered a scalar utility OF

$$f(x) = S\!\left(\beta\left(\frac{A(x)}{\sigma} - \alpha\right)\right)\,\frac{A(x)}{B(x) - A(x)}, \qquad (1)$$

where S denotes the sigmoidal prefactor, A is the area under the Raman curve for a particular RBM peak of interest (i.e., 265 ± 10 or 225 ± 20 cm−1) and B is the area under the broader RBM spectrum region (100-350 cm−1). The non-dimensionalized quantity $A/(B - A)$ measures the selectivity of the peaks of interest with respect to the rest of the broader region. We treated both A and B as functions of the synthesis parameters x. The prefactor was incorporated into the OF to diminish its value in cases where the signal-to-background for A was low. It takes values between 0 and 1, depends on the absolute size of the peak A, and is near 1 if the normalized quantity $A/\sigma$ surpasses a predefined threshold α, which was set to 100 counts. In Eq. (1), σ was set to the background level of the Raman spectrum, whereas the inverse length-scale parameter $\beta = 1/10$ (counts−1) determined how fast the prefactor scaled from 0 to 1 for values of $A/\sigma$ near the threshold α (further discussion of these model hyperparameters is presented in Supplementary Section 1). Although we chose to fix these values based on visual inspection of the rankings the OF produced, these and other hyperparameters could alternatively be optimized through maximum likelihood optimization.
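To make the structure of the OF concrete, the sketch below evaluates a selectivity term A/(B − A) multiplied by a sigmoidal prefactor in A/σ. The logistic form of the sigmoid and the function and argument names are assumptions made for illustration; the text specifies only that the prefactor is sigmoidal, with threshold α = 100 and inverse length scale β = 1/10.

```python
import math


def objective(area_peak: float, area_total: float, background: float,
              alpha: float = 100.0, beta: float = 0.1) -> float:
    """Selectivity A / (B - A) damped by a sigmoidal prefactor in A / sigma.

    The logistic sigmoid used here is an assumed functional form.
    """
    if area_total <= area_peak:
        raise ValueError("B must exceed A so that B - A is positive")
    selectivity = area_peak / (area_total - area_peak)
    # Prefactor goes from ~0 (weak peak) to ~1 (A / sigma well above alpha).
    prefactor = 1.0 / (1.0 + math.exp(-beta * (area_peak / background - alpha)))
    return prefactor * selectivity


if __name__ == "__main__":
    # Hypothetical areas and background level, for illustration only.
    print(objective(area_peak=30.0, area_total=120.0, background=0.1))
    print(objective(area_peak=30.0, area_total=120.0, background=1.0))
```

In the first call the peak is far above the threshold, so the result is essentially the bare selectivity; in the second the same selectivity is strongly suppressed because the peak is weak relative to the background.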
Fig. 1 Bottom: representative Raman spectrum in the low-frequency RBM range from a typical non-optimized growth experiment in ARES (laser excitation at 532 nm). Top: Kataura plot showing the possible matches for SWCNTs (solid red data points) within the chosen frequency ranges (shaded areas). There are three matches for the RBM at 225 cm−1 and two for the 265 cm−1 peak. The dashed horizontal lines in the Kataura plot represent the resonance window (±0.1 eV) for the SWCNTs.

Six experimental inputs x were used: the partial pressures of the reductants (ethylene, acetylene, and hydrogen) and oxidants (water vapor and carbon dioxide), and the growth temperature. As mentioned above, water vapor was found in trace amounts in our system (<1 p.p.m.) and was also accounted for in our partial pressure calculations. We have previously shown that the growth of SWCNTs is heavily mediated by a balance of two reduced-order parameters forming the input vector z, which is derived from the experimental inputs x: (1) temperature itself and (2) the ratio between the partial pressures of reducing and oxidizing species (pressure ratio) 56 . By combining the partial pressures of all the active precursors into a ratio of reductants over oxidants (hereafter called the pressure ratio), we were able to parse the growth results in terms of a reduced two-dimensional space (temperature vs. pressure ratio). These two features could therefore be used to determine whether SWCNTs grew on a catalyst nanoparticle or whether oxidation effects prevented such growth outright. Moreover, the critical region of growth could be delineated in this reduced-dimensional space such that the transition from the feasible "growth" phase to the infeasible "no-growth" phase was sharp. Throughout the campaign, we assumed that the location of this phase boundary in the reduced-dimensional space was unknown a priori, and that planned experiments had to be selected and run to learn the true feasible region in addition to the primary goal of discovering optimal synthesis conditions. Typically, phase-change phenomena leading to abrupt differences in experimental responses across phase boundaries may not be properly modeled using Gaussian processes 57 , because these often presume smoothness in the functions they model. Thus, avoiding such "no-growth" experiments and data lying on the phase boundary itself can help train models in which such smoothness constraints are fully satisfied by the data properly inside the "growth" region, far from the "no-growth" region.
Hence, when deciding what experiments to run, our planner avoided suggesting synthesis conditions x whose low-dimensional features $z = h(x)$ lay in the as-predicted "no-growth" region of the reduced-dimensional space.
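The mapping from the full set of synthesis conditions x to the reduced features z = h(x) can be illustrated as follows. This is a minimal sketch: the field names, units, and numerical values are hypothetical, and only the definition of the pressure ratio (summed reductant partial pressures over summed oxidant partial pressures) is taken from the text.

```python
from typing import NamedTuple, Tuple


class Conditions(NamedTuple):
    """Synthesis conditions x (hypothetical field names; pressures in Torr)."""
    temperature_c: float
    p_ethylene: float
    p_acetylene: float
    p_hydrogen: float
    p_co2: float
    p_water: float


def reduced_features(x: Conditions) -> Tuple[float, float]:
    """Map x to z = h(x) = (temperature, reductant/oxidant pressure ratio)."""
    reductants = x.p_ethylene + x.p_acetylene + x.p_hydrogen
    oxidants = x.p_co2 + x.p_water
    if oxidants <= 0.0:
        raise ValueError("oxidant partial pressure must be positive")
    return x.temperature_c, reductants / oxidants


if __name__ == "__main__":
    # Illustrative values only, chosen to give a strongly oxidizing ratio.
    x = Conditions(700.0, 0.10, 0.05, 0.15, 19.9, 0.1)
    print(reduced_features(x))
```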
To model the phase boundary in the reduced-dimensional space, we used a kernel SVM classifier and automatically labeled the experimental data into "growth" or "no-growth" classes using a thresholding parameter applied to the observed OF values. That is, given the experimental data $(x, f(x))$, we trained the kernel SVM classifier on the data $\left(h(x), \chi_{f(x) > f_{\min}}\right)$, where $\chi_{f(x) > f_{\min}}$ is an indicator function:

$$\chi_{f(x) > f_{\min}} = \begin{cases} 1, & f(x) > f_{\min} \\ 0, & \text{otherwise.} \end{cases}$$

We used the squared-exponential SVM kernel and a threshold value of $f_{\min} = 0.01$ (arbitrary units).
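The automatic labeling step can be sketched in a few lines. The function names below are illustrative; only the threshold f_min = 0.01 and the rule "growth if the OF exceeds f_min" come from the text, and the training of the kernel SVM itself is not shown.

```python
F_MIN = 0.01  # threshold on the observed OF (arbitrary units), from the text


def label_growth(of_values, f_min=F_MIN):
    """Return 1 ("growth") where the observed OF exceeds f_min, else 0."""
    return [1 if f > f_min else 0 for f in of_values]


def has_both_classes(labels):
    """A two-class classifier can only be trained once both labels occur."""
    return len(set(labels)) == 2


if __name__ == "__main__":
    # Hypothetical OF observations, for illustration only.
    labels = label_growth([0.16, 0.0, 0.24, 0.005])
    print(labels, has_both_classes(labels))
```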
Given a data set of n experimental inputs x and the corresponding observed (and presumed noisy) OF responses $f(x)$, we formed a Gaussian process Bayesian belief $\mathcal{GP}(\mu_n, \Sigma_n)$, where $\mu_n(x)$ is the time-n prediction of the true OF and $\Sigma_n(x, x')$ is the corresponding covariance function that provided the predicted statistical relationship between the OF values at pairs of inputs $x$ and $x'$. This Gaussian process belief was used for the EI decision-making policy (discussed in greater detail in Supplementary Section 2) 58 , which attempted to select the next experiment to run with the goal of balancing between resolving uncertainties in the prediction of the OF (called exploration) vs. focusing on promising regions of the response, based on current beliefs (exploitation). Such a balance between exploration and exploitation must be maintained in iterative, closed-loop experimental campaigns due to the lack of data available for most campaigns. Initial beliefs, which are based on a limited set of seed data, may have inaccurate predictions. Therefore, it would be premature to trust such predictions, in particular the optimal regions delineated by them. Conversely, the campaign had a specific optimization objective, so a limited experimental budget should not be wasted on generic learning of the response in order to build a globally accurate surrogate model. This balance was captured by the EI policy acquisition function $\mathrm{Acq}(x)$, which evaluated potential experiments to run next according to a predefined measure of the explorative and exploitative value of running the experiment. The policy selected the next experiment as the one that maximized $\mathrm{Acq}(x)$, subject to the constraint that the experimental inputs mapped to the feasible region in z-space predicted by the SVM classifier.
That is, the policy selected an experiment by solving the constrained optimization

$$x_{n+1} = \arg\max_{x \in X_n} \mathrm{Acq}(x),$$

where $X_n$ was the set of synthesis conditions x such that $z = h(x)$ was predicted to be in the "growth" region by an SVM classifier trained on the n available data points. If the training data did not contain examples of both growth and no-growth experiments, which is typical early in a campaign, the feasible set was simply defined to be the entire domain and the optimization of the acquisition function was unconstrained. Supplementary Fig. 2 shows the evolution of the SVM decision boundary over the first 25 experiments and shows that the SVM was trained and used to constrain the optimization only after 10 experiments.
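The constrained selection step can be illustrated on a finite set of candidate conditions. The sketch below uses the standard closed-form Expected Improvement for a Gaussian prediction (maximization); the exact variant used in this work is described in the Supplementary Information, so this form, the candidate-grid search, and all function names are assumptions made for illustration. The `predict` and `is_feasible` callables stand in for the Gaussian process belief and the SVM classifier, respectively.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Closed-form EI for a Gaussian prediction N(mu, sigma^2), maximization."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def select_next(candidates, predict, best, is_feasible=None):
    """Pick the candidate maximizing EI, restricted to the feasible set.

    predict(x) returns (mu, sigma); is_feasible(x) returns True or False.
    Passing is_feasible=None leaves the search unconstrained, as when the
    data do not yet contain both growth and no-growth examples.
    """
    pool = list(candidates)
    if is_feasible is not None:
        pool = [x for x in pool if is_feasible(x)]
    if not pool:
        raise ValueError("no feasible candidates to choose from")
    return max(pool, key=lambda x: expected_improvement(*predict(x), best))


if __name__ == "__main__":
    # Toy surrogate: predicted mean grows with x, constant uncertainty.
    predict = lambda x: (0.1 * x, 0.05)
    print(select_next([0, 1, 2], predict, best=0.1))
    print(select_next([0, 1, 2], predict, best=0.1, is_feasible=lambda x: x != 2))
```

In the toy example the unconstrained search picks the candidate with the highest predicted mean, whereas excluding it as infeasible moves the choice to the next-best candidate.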

Maximization of the 265 and 225 cm −1 RBMs
Our first target was to maximize the growth of SWCNTs with RBMs around 265 cm −1 . Three initial seed growth experiments were conducted in ARES, with growth conditions selected such that they corresponded to our typical experiments as published previously 39 . Including the three seed experiments, we performed a total of 74 growth experiments. Most of the experiments selected by the planner used the EI policy (learning mode). Exploitation experiments were selected every four to five experiments (exploitation mode) to track the evolution of the estimated best growth conditions. Following each growth experiment, a Raman spectrum was collected at room temperature with a minimal laser power of 0.2 W and 30 s acquisition time.
The low-frequency RBM regions between 100 and 350 cm −1 in these post-growth spectra were fitted with Lorentzian peaks (an example of a fitted spectrum in the RBM range is shown in Supplementary Fig. 3), enabling the calculation of the quantities A and B, and the OF as described above.
The results of all the growth experiments maximizing the area of the 265 cm−1 RBM peak are plotted in Fig. 2a, which shows a heat map of the OF against the growth temperature and the pressure ratio (partial pressures of reductants/oxidants). It is clear from Fig. 2a that the experimental conditions that maximized the 265 cm−1 RBM peak were confined to a relatively narrow range (indicated by the red dashed ellipse in Fig. 2a). These conditions corresponded to an average growth temperature around 700°C and oxidizing conditions (pressure ratio ~0.015). Figure 2b shows a Raman spectrum collected post growth from the growth experiment corresponding to the highest OF (0.33). It is interesting to note that even for the highest OF, two RBMs are observed in the spectrum, at 225 and 265 cm−1.
Next, we performed 41 additional experiments to maximize the area of the RBM at 225 cm−1; the results of these growth experiments are shown in Fig. 2c. The post-growth Raman spectrum corresponding to the highest OF is shown in Fig. 2d. Comparing the data in Fig. 2a, c, it is immediately apparent that the optimal temperatures and pressure ratios were different for the two RBMs. Unlike the case for the 265 cm−1 peak, the growth conditions that maximized the 225 cm−1 peak were more reducing (ratio of reductant/oxidant pressures around 1) and at higher temperatures (average growth temperature ~800°C). The differences between the data presented in Fig. 2a, c are remarkable considering that the diameters of the SWCNTs corresponding to these two RBMs, namely 225 and 265 cm−1, are very close (~1.06 and 0.91 nm, respectively). The observation of markedly different synthesis parameters for maximizing SWCNTs with similar diameters highlights the power of our EI decision policy planner, which produced these results in under a hundred experiments. Such a result would normally have taken several months to produce with traditional synthesis methods, whereas the ARES growth experiments were performed over a couple of weeks.
To map the chiral distribution of SWCNTs in the optimized growth experiments, we performed additional characterization of the RBMs using multi-excitation Raman spectroscopy in a Renishaw inVia Raman system. For this, we used three other laser excitations: 514.5, 633, and 785 nm. These three laser excitations cover the majority of SWCNTs with respect to their electronic transition energies and are commonly used for characterizing SWCNT populations using resonance Raman spectra 59,60 . Two-dimensional Raman spectral maps were collected on a number of ARES micropillars (11 × 11 µm areas with 1 µm spacing between points) and the average spectrum from each map was obtained. To emphasize the effectiveness of our planner, in Fig. 3a, b we show average multi-excitation Raman spectra corresponding to optimization of the 265 cm−1 RBM peak with a high (OF = 0.33) and low (OF = 0.03) OF, respectively. Additional multi-excitation Raman spectra from other experiments are shown in Supplementary Fig. 4. All spectra in the RBM region in Fig. 3a, b were normalized with respect to the highest intensity peak within the RBM frequency range (100-320 cm−1). The vertical dashed lines in Fig. 3a, b correspond to the RBM range targeted in our experiments: 255-275 cm−1.
A number of observations can be made from Fig. 3a. First, all the spectra contain more than one RBM. Second, the spectrum collected with only one of the excitations (514.5 nm) contains a predominant peak at 265 cm−1. The multi-excitation Raman spectra show that most of the RBMs lie within a relatively narrow range of frequencies (~220-280 cm−1), corresponding to a diameter range between 0.86 and 1.1 nm. This range is slightly wider than the range requested in the ARES optimization experiments (0.91 ± 0.04 nm). In contrast, there is no prominent peak at 265 cm−1 for the growth experiment that exhibited a low OF (0.03, Fig. 3b); in fact, all the RBMs appear at lower frequencies, indicating that poor optimization (low OF values) resulted in the growth of larger diameter SWCNTs.
To get a clearer idea of the chiral distribution of these SWCNTs, we recast the data from Fig. 3a, b onto the Kataura plots in Fig. 3c, d, where the relative intensities of the peaks in Fig. 3a, b are reflected in the size of the circular data points. We note that the data in these Kataura plots do not show the actual abundance of the chiral distribution, owing to the difficulty in estimating accurate SWCNT densities from the bundled mats on the surface of the Si micropillar. Rather, we show these plots for the purpose of visualizing the average chiral distributions as obtained from the multi-excitation Raman spectra. It is immediately apparent from Fig. 3c, d that we grew both semiconducting and metallic SWCNTs, indicating that there is no preference for conducting type in either population. Moreover, a tally of all the RBMs by conducting type results in a semiconducting to metallic SWCNT ratio of 65:35, similar to what is commonly observed in ensembles of CVD-grown SWCNTs 55 . The highlighted rectangular boxes in Fig. 3c, d denote regions with the highest RBM peak intensities, and the sharp contrast between the high and low OF can be seen clearly in the overlap (or lack thereof) between the highlighted box and the vertical dashed lines indicating the targeted RBM range.
The results presented in Fig. 3a-d can be compared to the Raman spectra collected from the campaign of experiments to maximize the 225 cm−1 RBM peak. The multi-excitation spectra for experiments that exhibited high and low OFs are shown in Fig. 3e, f, and the corresponding Kataura plots in Fig. 3g, h. Similar to the case of the 265 cm−1 RBM, the data in Fig. 3e reveal high-intensity RBMs close to 225 cm−1, indicating a narrow diameter distribution. The corresponding Kataura plot in Fig. 3g shows that the high-intensity RBMs correspond to SWCNTs with diameters that range between 1.05 and 1.35 nm. However, the big difference between the high and low OF growths in the case of the 225 cm−1 RBM campaign is the much wider SWCNT diameter distribution in the growths with low OF. This can be seen clearly in the larger size of the shaded rectangle in Fig. 3h.
By analyzing the intensities of the RBMs in the multi-excitation Raman spectra for the two growths that exhibited high OFs (Fig. 3a, e for RBMs at 265 and 225 cm−1, respectively), we found that 31% of the total RBM intensity was in the range 265 ± 10 cm−1. If we expanded the range to 220-280 cm−1 (i.e., SWCNT diameters ranging between 0.86 and 1.1 nm), the RBM intensities accounted for ~71% of the total intensity between 100 and 350 cm−1. This trend is similar in the case of the 225 cm−1 RBM: ~35% of the RBM intensity lay within our selected range (225 ± 20 cm−1) and 63% if we expanded the range to 180-230 cm−1. As mentioned above and seen in Fig. 2a, c, the growth conditions for the experiments that resulted in low and high OFs were different. For the 265 cm−1 RBM, temperatures for the high and low OF experiments were 700 and 750°C, respectively, whereas they were 820 and 830°C for the 225 cm−1 RBM. Although the differences in temperatures could account for shifts in SWCNT diameter distributions, it is likely that the results presented in Fig. 3 were caused by the significantly different pressure ratios between the high and low OF growth experiments. For the growth experiments maximizing the 265 cm−1 RBM, the pressure ratios for the high and low OF experiments were 0.015 and 0.33, respectively, whereas these values were 1.08 and 0.009, respectively, for the 225 cm−1 RBM. The orders-of-magnitude difference between the pressure ratios highlights the effectiveness of tuning SWCNT diameter distributions through control over the feed rates of the gaseous precursors.
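The intensity fractions quoted above amount to summing fitted RBM intensities inside a frequency window and dividing by the total over 100-350 cm−1. A minimal sketch of that bookkeeping follows; the peak list is hypothetical, and how intensities from different excitation wavelengths were combined is not specified in the text, so a simple pooled sum is assumed.

```python
def fraction_in_window(peaks, center, half_width, lo=100.0, hi=350.0):
    """Fraction of total RBM intensity within center +/- half_width.

    peaks is a list of (frequency_cm1, intensity) pairs; only peaks between
    lo and hi contribute to the total.
    """
    in_range = [(w, i) for w, i in peaks if lo <= w <= hi]
    total = sum(i for _, i in in_range)
    if total <= 0.0:
        raise ValueError("no RBM intensity in the reference range")
    inside = sum(i for w, i in in_range if abs(w - center) <= half_width)
    return inside / total


if __name__ == "__main__":
    # Hypothetical fitted peaks (frequency in cm^-1, integrated intensity).
    peaks = [(162.0, 1.0), (225.0, 2.0), (265.0, 3.0), (300.0, 4.0)]
    print(fraction_in_window(peaks, center=265.0, half_width=10.0))
```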

SWCNT yield
Efforts to increase selectivity during SWCNT growth typically suffer from a drop in the yield, owing to a variety of factors such as low catalyst activities and difficulty in controlling the size and morphology of a large number of particles during high-temperature CVD growth 26 . As mentioned above, in our experiments we grew sparse SWCNT bundles on the ARES micropillars, making it difficult to accurately estimate the growth yield through counting or weight gain measurements. However, the overall intensity of the SWCNT G band (G max.), collected at the end of each experiment in the post-growth Raman spectrum, could be used as a proxy for the yield. In order to see the effect of the planner on the diameter selectivity and yield, the data in Fig. 2 were replotted by multiplying the OFs by the respective G max. values (normalized by the intensity of the Si substrate peak at 520 cm−1). Figure 4a, b shows the heat maps of OF × G max. against the growth temperatures and pressure ratios for growth experiments maximizing RBMs at 265 and 225 cm−1, respectively. In the case of the 265 cm−1 RBM, the high-yield area corresponds to the same temperature and pressure ratios as the area that exhibited high OFs or diameter-selective growth (Fig. 2a). However, the highest yields for the 225 cm−1 RBM correspond to different synthesis conditions compared to the regions of high diameter selectivity. As discussed above, the 225 cm−1 RBM was maximized at a temperature of ~800°C and a pressure ratio around 1 (Fig. 2c). When we take the SWCNT yields into account, we see that the highest yields occur for more oxidizing conditions, for pressure ratios between 0.1 and 0.01. In fact, these pressure ratios are close to those for the 265 cm−1 RBM (~0.015, Fig. 2a). Multi-excitation Raman analysis of the RBMs showed that the intensity of the 225 cm−1 RBM (±20 cm−1) dropped to ~26%. This result, and the differences in the pressure ratios shown in Figs. 4b and 2c, corroborate the inverse relationship between yield and selectivity. Importantly, however, our studies, enabled by the exploration of a wide range of precursor pressures, show that it is possible to find conditions in the synthesis phase space leading to high diameter selectivity and high yields.
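The yield-weighted quantity plotted in Fig. 4 can be written as the OF multiplied by the G-band intensity normalized to the Si substrate peak. The short sketch below assumes this simple product; the function name and the numerical values are illustrative only.

```python
def yield_weighted_objective(of_value: float, g_max: float, si_peak: float) -> float:
    """OF multiplied by the G-band maximum normalized to the Si 520 cm^-1 peak."""
    if si_peak <= 0.0:
        raise ValueError("Si peak intensity must be positive")
    return of_value * (g_max / si_peak)


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units.
    print(yield_weighted_objective(of_value=0.33, g_max=2.0, si_peak=4.0))
```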

Exploration vs. exploitation analysis
To analyze the efficiency of the planner, in Fig. 5 we plot the OFs over the course of the experimental campaigns to maximize the 265 and 225 cm−1 RBMs. As mentioned previously, the first three experiments of both campaigns were manually chosen seed experiments and are indicated by the black data points in Fig. 5. Experiments conducted in exploration and exploitation modes are indicated in Fig. 5 by blue and red data points, respectively. Looking at the progression of the OF during the campaign to maximize the RBM at 265 cm−1 (bottom panel in Fig. 5), it is clear that the highest OF was achieved in the 27th experiment in the exploitation mode. The experiments that exhibited the highest OFs are indicated by the arrows in Fig. 5. Other than one of the experiments (#58), the OFs in the exploitation mode were also consistently high, averaging 0.24 ± 0.06 through the 74-experiment campaign. The high OFs obtained in the exploitation mode are not surprising considering that the planner chose the temperature and pressure ratios such that they produced high OF values based on the three seed experiments (average 0.16 ± 0.08).
However, it is interesting that the high OF values in both experimental campaigns were greater than the OFs in the seed experiments. On the other hand, not surprisingly, the experimental conditions were chosen over a much wider range of conditions in the learning mode, leading to a large variety in OF values. Similar trends can be seen in the OF values for the other experimental campaign to maximize the RBM at 225 cm −1 (top panel in Fig. 5). Four experiments were conducted in the exploitation mode, with an average OF around 0.88 ± 0.17; the highest OF (1.12) was achieved in the tenth experiment.
There is a similar balance between exploration and exploitation that occurs during the learning and utilization of the correct growth vs. no-growth classifier. We note here that the decision-making policy attempts solely to optimize the OF. In particular, we make no attempt to select experiments that balance between this optimization and the learning of an optimal classifier in the two-dimensional feature space. There are various heuristics and approaches for augmenting the decision-making procedure to accomplish this auxiliary learning, including performing Bayesian optimization with unknown constraints 61 and a more generic framework called the Mean Objective Cost of Uncertainty 62 . In the present setting, however, the balance is achieved in an emergent, albeit not necessarily optimal, manner. Specifically, the choice of experiments in the growth region leads to a potential class imbalance favoring the "growth" class. This imbalance in turn leads to an overestimation of the growth region, which may lead to the selection of no-growth experiments, mitigating such an imbalance. The optimization of this balance could potentially increase the effectiveness of this technique. More broadly, the use of the SVM classifier in this data-limited setting to specify feasible regions over which we make decisions under uncertainty represents a mixture of Bayesian and frequentist methods. This mixture merits further discussion, which we have presented in Supplementary Section 3.

Simulations
To further explore this system, we performed statistical simulations of subsequent experimental campaigns. Taking existing data obtained from the growth experiments on the 1 nm Co catalyst, we fitted a Gaussian process (GP) posterior belief to the observed OF values from the data. From this belief, we sampled ground truths and by doing so we obtained a family of sampled OFs $f_i(x)$, which were consistent with existing experimental data. For each such sample of the ground truth OF, we simulated an experimental campaign in a manner identical to the physical campaigns outlined above. However, in place of running an actual, physical experiment on ARES for a selected set of experimental conditions x, we instead sampled a noisy observation

$$y = f_i(x) + W,$$
where W is an additive noise term drawn from a normal distribution with mean 0 and variance $\sigma_W^2$. Apart from using a statistically simulated experimental observation, such simulated campaigns proceeded identically to their physical analog on the ARES experiments, including the belief modeling, decision-making policies, and the use of classification models to define nonlinear feasible sets over which to optimize acquisition functions of the decision-making policies in a constrained manner. In particular, after n steps of the simulated campaign, we could predict the optimal synthesis conditions $x^{\star,n}$ based on the n simulated experimental data points available at the time. From this, we calculated the sampled ground truth OF value at this predicted optimum, $y^{\star,n} = f_i(x^{\star,n})$. As a measure of improvement, we then compared this best value to the best observed value from the existing experimental data set, $y^{\star,\mathrm{data}}$. For every simulation we ran, we calculated this difference to report a percent increase statistic:

$$D(n) = 100\,\frac{y^{\star,n} - y^{\star,\mathrm{data}}}{y^{\star,\mathrm{data}}}. \qquad (4)$$

By running several simulations, we obtained distributions of this statistic, allowing us to make inferences about average- and extreme-case behavior to be expected from further simulations.
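The two ingredients of the simulated campaigns described here, a noisy observation of a sampled ground truth and the percent increase statistic D(n), can be sketched as follows. The function names and the toy ground truth are hypothetical; sampling the ground truth from a Gaussian process posterior is not shown.

```python
import random


def simulated_observation(ground_truth, x, noise_std, rng):
    """Return f_i(x) + W with W drawn from N(0, noise_std^2)."""
    return ground_truth(x) + rng.gauss(0.0, noise_std)


def percent_increase(y_star_n: float, y_star_data: float) -> float:
    """D(n) = 100 * (y*_n - y*_data) / y*_data."""
    return 100.0 * (y_star_n - y_star_data) / y_star_data


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    toy_truth = lambda x: 0.3 - (x - 0.5) ** 2  # hypothetical ground truth
    print(simulated_observation(toy_truth, 0.5, noise_std=0.02, rng=rng))
    print(percent_increase(y_star_n=0.3729, y_star_data=0.33))
```

A positive D(n) means the predicted optimum improves on the best value in the existing data; a negative value means the noisy belief led to a worse choice.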
Such simulations therefore allow us to study the impact of various modeling or policy choices on the effectiveness of a closed-loop campaign. As a first study, we ran simulated campaigns to compare the effectiveness of the EI policy with that of a Pure Exploration policy, which simply selects a feasible synthesis condition uniformly at random, and the Maximum Variance policy, which selects the experiment whose predicted OF has the highest uncertainty and is often used for Active Learning campaigns. We also considered the Upper Confidence Bound (UCB) policy, which selects experiments that maximize an optimistic estimate of the OF response given by an upper confidence bound 63 . For each policy, we ran 100 simulated campaigns, sampling a different ground truth OF for each campaign, followed by calculation of the D(n) statistic. Figure 6a shows the median values of D(n) vs. n for both the EI and Exploration policies, calculated over the 100 simulated campaigns. In addition, shaded regions indicating the 25th and 75th percentiles are shown. The figure indicates that the EI policy will, on average, result in a 13% increase of the best observed value over 20 experiments, with a spread of between 2% and 30% increase observed over the simulations. In contrast, the pure exploration policy does not improve over an initial 3-4% increase, with a spread between a 12% increase and a 6% decrease of the predicted best value observed over the simulations. We observe a similar performance from the maximum variance policy. Here, such a decrease would suggest campaigns in which, due to noisy observations of the sampled ground truth, an inaccurate belief of the ground truth results in a predicted best synthesis condition $x^{\star,n}$ whose corresponding ground truth value $y^{\star,n} = f_i(x^{\star,n})$ actually yields poor performance. From these simulations, we see that the EI policy does outperform pure exploration, i.e., random sampling. The UCB policy performs the best overall.
Interestingly, the UCB policy is typically used for the multi-armed bandit problem, in which we wish to optimize the cumulative rewards (here, the sum of observed OF values) obtained throughout a campaign. Due to this, the policy, although attempting to balance between exploration and exploitation, more heavily favors the latter. Its strong performance suggests that enough experiments are present for the campaign to transition more heavily to exploitation, even more than the natural balance present in the EI algorithm.
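The decision policies compared above differ only in how they score a candidate synthesis condition from the posterior mean and standard deviation of the belief at that condition. The sketch below gives textbook forms of the EI, UCB, and Maximum Variance scores for a maximization problem. The function names, the candidate values, and the UCB weight `beta` are hypothetical choices for illustration and are not taken from this study; the Pure Exploration policy would instead pick a feasible condition uniformly at random.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """E[max(f - best, 0)] for f ~ N(mu, sigma^2), maximization form."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


def upper_confidence_bound(mu, sigma, beta=2.0):
    """Optimistic estimate mu + beta * sigma; beta is an assumed weight."""
    return mu + beta * sigma


def max_variance(mu, sigma):
    """Pure uncertainty score: the predictive variance."""
    return sigma * sigma


def select(candidates, score):
    """Return the label of the (label, mu, sigma) candidate maximizing `score`."""
    return max(candidates, key=lambda c: score(c[1], c[2]))[0]


if __name__ == "__main__":
    # Hypothetical posterior summaries at three feasible conditions.
    candidates = [("A", 1.0, 0.1), ("B", 0.8, 0.6), ("C", 0.2, 1.0)]
    best = 1.0  # best observed OF value so far (assumed)
    print(select(candidates, lambda m, s: expected_improvement(m, s, best)))
    print(select(candidates, upper_confidence_bound))
    print(select(candidates, max_variance))
```

A small `beta` makes the UCB score track the posterior mean (exploitation), whereas a large `beta` makes it track the uncertainty (exploration), which is one way to see how the balance between the two can be shifted.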
For another simulation study, we considered the impact of constraining experimental suggestions to a predicted feasible set on the campaign performance. Specifically, recall that an SVM classifier was trained on experimental data that was projected to the two-dimensional space of temperature and the ratio between oxidizers and reducers. These two quantities were used as features of the classifier. Class labels were obtained by thresholding the observed OF values, here using Z-normalized data. Normalized data points whose value $y$ surpassed a predefined threshold, $y > y_{\mathrm{thresh}}$, were labeled with a "growth" class, whereas those that did not surpass this threshold were labeled with a "no-growth" class. That is, $y_{\mathrm{thresh}}$ is analogous to $f_{\min}$ defined above, but applied to normalized data. Larger values of $y_{\mathrm{thresh}}$ implied more constraints in selecting experimental actions, meaning that actions selected within a more constrained feasible set were predicted to result in more substantial growth. Conversely, smaller values implied less constraint and indicated a willingness to select experimental conditions that may result in low growth. Thus, a selection of $y_{\mathrm{thresh}}$ can be considered as a specification of a risk tolerance level: larger values imply a risk-averse campaign, whereas smaller values imply a risk-tolerant one.
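The labeling step described above can be sketched as follows. This is a minimal example under stated assumptions: the helper names and the toy OF values are hypothetical, the population standard deviation is assumed for the Z-normalization (the text does not specify which estimator was used), and the subsequent SVM training on temperature and pressure ratio is not shown.

```python
import statistics


def z_normalize(values):
    """Z-normalize values using the population standard deviation (assumption)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def label_growth(of_values, y_thresh):
    """Label each experiment 'growth' if its normalized OF exceeds y_thresh."""
    return ["growth" if y > y_thresh else "no-growth"
            for y in z_normalize(of_values)]


if __name__ == "__main__":
    toy_of = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical observed OF values
    for thresh in (0.1, -0.1, float("-inf")):
        labels = label_growth(toy_of, thresh)
        print(thresh, labels.count("growth"))
```

Lowering the threshold can only enlarge the set of experiments labeled "growth", and a threshold of negative infinity labels every experiment as "growth", which corresponds to an unconstrained selection.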
We ran simulation studies to consider the impact of such a threshold, using threshold values of 0.1, −0.1, and −∞, which correspond to risk-averse, risk-tolerant, and unconstrained selection of synthesis conditions, respectively. For each threshold value, we ran 100 simulations and calculated the performance metric D(n), which is shown in Fig. 6b. We observe that for the risk-averse ($y_{\mathrm{thresh}} = 0.1$) setting, campaigns obtained a 14% improvement over the existing data set over 20 experiments, on average. However, both the risk-tolerant and unconstrained settings perform better, obtaining improvements of around 19% and 20%, respectively. For future experimental runs, the simulations therefore suggest that an increased risk tolerance would be beneficial. These simulations offer a way to calibrate the risk tolerance of a campaign when determining experimental feasibility, along with other hyperparameters. Moreover, such a parameter could be dynamically tuned along with the changing beliefs of the response function itself. An implementation of such a dynamically tuned (via simulations) risk tolerance factor could improve the performance of closed-loop campaigns in which such experimental feasibility plays a significant role.

Fig. 6 Results from simulations of experimental campaigns. a Median and the interquartile range for the Percent Increase statistic for an EI-driven and Exploration-driven campaign. The EI campaign resulted in a median increase of around 12%, whereas the Exploration campaign resulted in no median increase. b Percent Increase for different risk levels, including risk-averse (threshold = 0.1) and more risk-tolerant (threshold = −0.1 and unconstrained) campaigns. In this context, risk-tolerant campaigns performed better than risk-averse ones.

DISCUSSION
We used an ML planner based on the EI policy to map the growth parameter phase spaces for maximizing the intensities of two RBMs at 225 and 265 cm−1, which correspond to SWCNT diameters around 1.06 and 0.92 nm, respectively. As optimization parameters, we considered the growth temperature and the partial pressures of reductants (C2H4, C2H2, and H2) and oxidants (water vapor and CO2). By considering the ratio of pressures of reductants to oxidants, the reduced dimensionality allowed us to construct growth condition maps from our results, which revealed significantly different optimal conditions for maximizing the intensity of RBMs around 265 vs. 225 cm−1. These differences in growth conditions highlight the effectiveness of our planner, which achieved results in fewer than a hundred experiments. Our EI decision policy also enabled us to assess the effect of experimental risk on the rate of convergence, and we found that it had a significant impact. In the future, this approach could be extended to include thermodynamic models of catalyst reduction and rate equations of catalytic feedstock dissociation. Although the best diameter selectivity was around 35%, our methodology paves the way forward for achieving the ultimate goal, namely diameter- and chirality-controlled SWCNT growth.

METHODS
CNT growth in ARES
CNT growth was performed in our custom-built in situ system named ARES. In ARES, a 6 W 532 nm laser (Verdi) serves as both the heat source and the Raman excitation source, and is focused on a silicon substrate consisting of patterned micropillars on an SiO2 underlayer (10 µm in diameter and height, fabricated by reactive ion etching). The substrates are loaded into a miniature high-vacuum chamber with an optical window, whose environment can be controlled through automated pressure and gas mass flow controllers. Heating of the thermally isolated micropillars is achieved by varying the laser power, allowing rapid increases (within microseconds) in temperature up to 1200°C. The scattered light from the micropillars is coupled to a spectrometer through focusing optics and a notch filter, enabling in situ Raman measurements. The micropillar temperature is estimated from the redshifted Raman peak frequency of the silicon micropillar. For this study, we first deposited a 10 nm alumina barrier layer onto the silicon micropillars by atomic layer deposition. Subsequently, 1 nm Co films were deposited on the substrates by magnetron sputtering. The micropillar substrate was loaded into the growth chamber, which was evacuated to a base pressure of $10^{-6}$ Torr and then backfilled with the growth gases. As described in the main text, we used the ratio of the partial pressures of reductants (ethylene, acetylene, and hydrogen) to those of oxidants (water vapor and carbon dioxide), labeled the pressure ratio, as chosen by the planner. Before and after each growth experiment, Raman spectra (pre- and post-scans) were collected at room temperature using a low laser power (0.2 mW) and a 30 s acquisition time. During the course of each experiment, spectra were also collected in real time (every 3 s), and the RBM peak fitting was performed on the post-scans following each experiment.
These RBM frequencies were immediately fed into the planner to compute the parameters (i.e., the temperature and pressure ratios) for the next growth experiment.