A robotic platform for the synthesis of colloidal nanocrystals

Morphological control with broad tunability is a primary goal for the synthesis of colloidal nanocrystals with unique physicochemical properties. Here we develop a robotic platform as a substitute for trial-and-error synthesis and labour-intensive characterization to achieve this goal. Gold nanocrystals (with strong visible-light absorption) and double-perovskite nanocrystals (with photoluminescence) are selected as typical proof-of-concept nanocrystals for this platform. An initial choice of key synthesis parameters was acquired through data mining of the literature. Automated synthesis and in situ characterization with further ex situ validation were then carried out, and controllable synthesis of nanocrystals with the desired morphology was accomplished. To achieve morphology-oriented inverse design, correlations between the morphologies and structure-directing agents are identified by machine-learning models trained on a continuously expanded experimental database. Thus, the developed robotic platform with a data mining–synthesis–inverse design framework is promising in data-driven robotic synthesis of nanocrystals and beyond.

Trial-and-error synthesis and labour-intensive characterization procedures hinder the development of nanocrystals. Now, a data-driven robotic synthesis approach is used to prepare gold and double-perovskite nanocrystals. This approach combines data mining of synthesis parameters, robot-assisted synthesis and characterization, and machine-learning-facilitated inverse design of the nanocrystals.

Here we show how these limitations can be addressed by developing a robotic platform through a framework consisting of data-driven automated synthesis, robot-assisted controllable synthesis and morphology-oriented inverse design. We demonstrate this platform by synthesizing inorganic colloidal NCs, with an emphasis on their morphological control. Specifically, gold NCs (with strong visible-light absorption and no photoluminescence) and lead-free double-perovskite NCs (with photoluminescence and weak visible-light absorption) are selected as typical proof-of-concept NCs for research topics that are well known and emerging, respectively. Moreover, the correlations between tunable NC morphologies and key synthesis parameters are identified by ML models trained on a robot-assisted high-throughput experimental database.

Framework of the robotic platform
To overcome the limitations of robotic synthesis and explore the complex tunable morphologies of colloidal NCs, a new robotic platform was specifically developed for high-throughput synthesis and characterization of NCs. To design such a platform, the key steps in a typical research project performed by human scientists, namely, searching the literature, conducting the synthesis and characterization of NCs, and iterating throughout trial-and-error experiments to obtain optimized synthesis parameters, were all considered in the conceptual framework of the robotic platform. This new synthetic framework, as illustrated in Fig. 1, integrates data mining of the initial key synthesis parameters from the literature ( Fig. 1(1)), controllable synthesis of colloidal NCs ( Fig. 1(2)) and inverse design of targeted NC morphologies ( Fig. 1(3)).
In the framework, data mining of related literature is first conducted to provide initial choices of key synthesis parameters of NCs, such as the types or concentrations of surfactants. To illustrate the operation of the whole platform, two typical demonstrations, gold NCs and double-perovskite NCs, are selected for exploring research topics with abundant and relatively little published literature, respectively.
Based on the synthesis parameters obtained from data mining, we conducted high-throughput synthesis and characterization of NCs to investigate morphology-controlled synthesis (controllable synthesis), which has been a research hotspot in materials chemistry 31 . The key synthesis parameters that control the crystal morphology are identified as structure-directing agents (SDAs). The processes integrate Robotic Execution Excel files, a simulated operation system, and automated synthesis and characterization modules. Specifically, the Robotic Execution Excel files are designed for execution of the automated platform, which works as a user-friendly interactive interface between the platform and researchers, in which no programming skills are required. The experimental design is accomplished by writing the Excel file with information about the SDAs. The designed procedures written in the Excel file are pre-examined before the experiment, and monitored in real time during the robotic synthesis process by the simulated operation system ( Supplementary Videos 1 and 2).
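The pre-examination of a written plan can be illustrated with a minimal sketch. It assumes a plan has already been read from the spreadsheet into a list of rows; the row layout, the well-naming pattern, the 300 μl capacity and the function name `check_plan` are all hypothetical and are not taken from the platform's software.

```python
import re

# Assumption: a 96-well plate with rows A-H and columns 1-12.
WELL_PATTERN = re.compile(r"^[A-H](1[0-2]|[1-9])$")
# Assumption: illustrative working capacity of one well, in microlitres.
MAX_WELL_VOLUME_UL = 300.0


def check_plan(rows):
    """Return a list of human-readable problems found in a synthesis plan."""
    problems = []
    seen = set()
    for i, row in enumerate(rows, start=1):
        well = row["well"]
        volumes = row["volumes_ul"]
        if not WELL_PATTERN.match(well):
            problems.append(f"row {i}: invalid well '{well}'")
        if well in seen:
            problems.append(f"row {i}: well '{well}' used twice")
        seen.add(well)
        for reagent, volume in volumes.items():
            if volume < 0:
                problems.append(f"row {i}: negative volume for {reagent}")
        if sum(volumes.values()) > MAX_WELL_VOLUME_UL:
            problems.append(f"row {i}: total volume exceeds well capacity")
    return problems


# Hypothetical plan rows (SDA volumes in microlitres).
plan = [
    {"well": "A1", "volumes_ul": {"CTAB": 100.0, "AgNO3": 10.0, "HCl": 5.0}},
    {"well": "A1", "volumes_ul": {"CTAB": 120.0, "AgNO3": 10.0, "HCl": 5.0}},
    {"well": "J3", "volumes_ul": {"CTAB": 400.0, "AgNO3": -1.0, "HCl": 5.0}},
]
issues = check_plan(plan)
for issue in issues:
    print(issue)
```

Checking the whole plan before any liquid is dispensed mirrors the idea of examining the designed procedure ahead of the experiment; real-time monitoring during the run is outside the scope of this sketch.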
The platform was built with the desired synthesis and characterization functionalities, as shown in Fig. 2. Photographs and a schematic representation of the platform are shown in Fig. 2a,b, respectively. Compared to the currently available platforms 4–8,13–17,19,20 , our platform is specifically developed for automated synthesis and characterization of colloidal NCs: it is equipped with rapid optical characterization modules, such as a spectrometer (for ultraviolet-visible-near-infrared absorption measurement), a colour-ultrasensitive mobile camera (to obtain photographs and red-green-blue (RGB) values) and light sources (white light and ultraviolet light), and it is integrated with automated synthesis modules and two collaborative robots. The typical properties of gold NCs (with strong visible-light absorption) and double-perovskite NCs (with photoluminescence) are robotically

in a breakthrough for accelerating the discovery of materials 24 . However, in practice, robot-assisted high-throughput characterization remains less explored, whereas synthesis planning has been integrated with retrosynthesis to streamline robotic synthesis 15 . A software platform that could directly translate the organic chemistry literature into editable code to drive automated synthesis has been reported 19 . Moreover, advanced data-driven models have been recently applied to extract the organic synthesis parameters from patents for synthesis planning 22,23 .
The application of robotic synthesis, pioneered through the synthesis of organic materials, is expected to be expanded to the field of inorganic materials 24 . A state-of-the-art robotic chemist has been reported to significantly improve the performance of photocatalysts 4 , indicating the great potential for robotic characterization of inorganic materials. The adoption of a robotic platform in the preparation and characterization of lead halide perovskites has been reported 25 . Other lead-containing perovskite (MAPbBr 3 , FAPbBr 3 , CsPbI 3 , CsPbBr 3 (ref. 26 3 (ref. 28 )) thin films were further studied by incorporating automated synthesis, robotic characterization and machine learning (ML) techniques. Moreover, the combination of automation and AI has been applied to the discovery of palladium thin films 8 . However, data mining of the literature to drive the robotic search for targeted inorganic materials, especially NCs, has rarely been explored.
In the field of inorganic materials, data-driven initial hypotheses 22,23 , robot-assisted synthesis 4,6 and experimental databases 11,18,29 are promising for integration on robotic platforms to progressively acquire knowledge 30 , iteratively develop ML models 18 and efficiently reveal data correlations 16 . Nevertheless, limitations still exist that hinder the robotic synthesis of inorganic NCs, for example, the lack of data-driven models for translating synthetic goals into robotic synthesis 4

Through automated synthesis and characterization, SDAs and the corresponding characterization results are acquired. ML models are then trained to identify the correlations between SDAs and NC morphologies. Finally, inverse design, in which the desired morphologies are used to predict the SDAs used in robotic synthesis (morphologies → SDAs) 33 , is implemented to accelerate the synthesis of targeted NCs. Moreover, the database is continuously expanded through data mining, automated characterization and ML predictions, which provides constructive guidance for achieving morphology-oriented inverse design.
This framework ( Fig. 1) of data mining-synthesis-inverse design will be discussed in detail to demonstrate the capabilities of this robotic platform (literature search, NC synthesis and characterization, and correlation identification) in the following sections.

Data-driven automated synthesis
To plan and perform material synthesis, key synthesis parameters must be known, which are often determined on the basis of literature reports, trial-and-error experiments or the researcher's previous experience. Recently, synthesis parameters, such as solvents and ligands, were predicted by extracting experimental text from patents, using state-of-the-art data-driven models 22,23 . Although manual work cannot easily be fully replaced by data mining, automated literature searches can drastically boost efficiency compared with traditional trial-and-error experiments and can greatly reduce the dependency on the researcher's expertise. However, in practice, data mining of synthesis parameters for automated synthesis, especially for NCs, is an emerging research area.
In the current study, data mining of the literature was applied to drive the platform with the aid of an automated literature recommendation system 35 . Key parameters required for the synthesis of well-known gold NCs and less-known double-perovskite NCs were extracted from the literature in two typical ways, either directly from the specific literature of gold NCs or indirectly from the related literature of other perovskites ( Supplementary Fig. 1).
To demonstrate that the platform can address the difficulty of studying morphology-controlling synthesis variables among a large number of parameters in abundant published literature, gold NCs were selected as a typical example (Supplementary Fig. 1). For the known gold NC synthesis protocol, the target of data mining is to determine the most frequently used concentrations of the surfactant (hexadecyltrimethylammonium bromide (CTAB)) and other agents (for example, AgNO3, HCl, l-ascorbic acid (AA), HAuCl4, NaBH4 and gold seeds). Figure 3a and Supplementary Fig. 2 (in detail) show the frequency distribution of the concentrations of key synthesis parameters reported in 1,300 studies regarding this specific synthesis protocol. Among them, the papers with the most frequently used parameters are indicated with blue dots in Fig. 3a. Moreover, L2, corresponding to the highest-frequency parameter, is defined as the middle level, and a linear transformation (Supplementary Table 1) is applied to keep the L2 parameters at the same vertical position, as shown in Fig. 3b. By further fitting the curves obtained by Gaussian expansion, the boundaries of the shaded rectangle are determined as the low (L1) and high (L3) levels. Hence, L1, L2 and L3 are selected as key synthesis parameters for robotic execution of experiments.
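The level selection described above can be sketched as follows. This is a simplified illustration, not the fitting procedure used in the study: the middle level L2 is taken as the most frequent mined value, and the spread of the values around L2 stands in for the Gaussian fit, with L1 and L3 placed one standard deviation on either side. The ±1σ rule, the function name `three_levels` and the concentration values are assumptions made for this example.

```python
from collections import Counter


def three_levels(values, width=1.0):
    """Return (L1, L2, L3) from a list of mined parameter values.

    L2 is the most frequently reported value. L1 and L3 are placed
    `width` standard deviations (measured around L2) below and above it.
    This +/- sigma rule is an assumption standing in for a Gaussian fit.
    """
    mode_value, _ = Counter(values).most_common(1)[0]
    sigma = (sum((v - mode_value) ** 2 for v in values) / len(values)) ** 0.5
    low = max(mode_value - width * sigma, 0.0)  # concentrations cannot be negative
    high = mode_value + width * sigma
    return low, mode_value, high


# Synthetic CTAB concentrations (in M), invented for illustration only.
mined_ctab = [0.05, 0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.15, 0.05, 0.1]
l1, l2, l3 = three_levels(mined_ctab)
print(f"L1={l1:.3f} M, L2={l2:.3f} M, L3={l3:.3f} M")
```

Anchoring the levels on the mode rather than the mean keeps the middle level at the value that most papers actually report, which matches the definition of L2 given in the text.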
Based on the data-mining results, the automated synthesis followed an orthogonal design, as shown in Supplementary Table 2. Ultraviolet-visible-near-infrared absorption spectroscopy (Supplementary Fig. 3) and multivariate analysis of variance (Supplementary Table 3) were performed. Twenty-four initial levels of experimental conditions (Supplementary Table 4) were chosen for a further single-factor (adjusting one factor) study based on the results of the orthogonal experiments. To explore the potential impact of all single factors (CTAB, AgNO3, HCl, gold seeds, AA and HAuCl4), 24 levels and 96 extended levels of high-throughput experiments were designed, as shown in Supplementary Table 4, and were carried out to construct a database. A photograph of 96 samples obtained from automated synthesis with the 96 extended levels of CTAB is displayed in Fig. 3c as an example, showing the gradually changing colour of the synthesized gold NCs. The corresponding optical absorption spectra are provided in Supplementary Fig. 4, showing the strong absorption of gold NCs in the visible region. These results suggest that adjusting the concentration of CTAB as a key synthesis parameter can lead to a colour change and a peak shift of the longitudinal surface plasmon resonance (LSPR), indicating that such parameters can affect morphology manipulation. Hence, a further study of morphology-controlled synthesis will be of great significance.
For the research topic of lead-free double-perovskite NCs, about which there is less published literature (although lead-containing perovskite materials have been widely studied), the target of data mining is to identify potential surfactants and solvents in the related literature for the synthesis of Cs2AgIn1−xBixCl6 NCs as another typical demonstration (Supplementary Fig. 1). The bismuth ions are doped into the cubic unit cell of the Cs2AgInCl6 crystal to enhance the crystal quality and photoluminescence efficiency. This lead-free material was selected for its low toxicity and high photoluminescence efficiency. A probe of the potential influence of all solvents (Supplementary Table 5) and surfactants (Supplementary Table 6) with the initial choices obtained from data mining (Fig. 3d) was conducted on the platform.
First, in situ photoluminescence characterization (under irradiation by ultraviolet light with emission at 365 nm) of 48 solvents was conducted in an attempt to screen the double-perovskite samples with the highest photoluminescence efficiency. The images were obtained using a colour-ultrasensitive mobile camera, and the colours were analysed both qualitatively by visual inspection and quantitatively by digital analysis of the RGB values, as shown in Fig. 3e and Supplementary Fig. 5. The R values of the samples with ethanol (A4), ethyl acetate (B2), isopropanol (IPA, B4), diethyl ether (C3), acetic acid (C5) and 1,4-dioxane ( 2 B2) as the solvent exposed to ultraviolet irradiation were higher than 240, and their emissions were much brighter, as shown in Fig. 3e, indicating higher photoluminescence efficiency.
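The quantitative part of this screening can be sketched in a few lines. The example assumes that the pixels belonging to each well have already been cropped from the photograph; the pixel values, the well labels used as keys and the function names are invented for illustration, and only the threshold of 240 on the R value comes from the text.

```python
def mean_rgb(pixels):
    """Average a list of (R, G, B) tuples channel by channel."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def bright_wells(well_pixels, r_threshold=240):
    """Return the wells whose mean R value exceeds the threshold."""
    selected = []
    for well, pixels in well_pixels.items():
        r, _, _ = mean_rgb(pixels)
        if r > r_threshold:
            selected.append(well)
    return sorted(selected)


# Synthetic pixel samples per well (illustrative values only).
well_pixels = {
    "A4": [(250, 180, 90), (246, 176, 88)],
    "A5": [(120, 60, 30), (118, 62, 28)],
    "B2": [(244, 170, 85), (242, 172, 83)],
}
print(bright_wells(well_pixels))
```

Averaging over several pixels per well before thresholding reduces the influence of single-pixel noise or reflections on the selection.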
Second, with these six solvents selected, the potential role of 61 surfactants (obtained from data mining, as additives in 366 experiments) in the morphology tuning of NCs was investigated. The supernatant was extracted for photoluminescence analysis under ultraviolet irradiation since crystals of small sizes can be better dispersed in solution (which results in a colour difference in the solution), while larger crystals tend to quickly precipitate. For most samples with different additives, the photoluminescence colours show little difference due to the precipitation of larger crystals at the bottom. The dispersibility of the extracted supernatant of all the samples was characterized, as shown in Fig. 3f (the rest of the data are shown in Supplementary Fig. 6).
The results illustrate that the sample with polyvinyl pyrrolidone (PVP, circled well in Fig. 3f) as the surfactant exhibits the best dispersibility and strongest photoluminescence (R = 39), indicating the possible formation of smaller crystals, which is worth further investigating for controllable synthesis of crystals with a wide range of sizes.

Robot-assisted controllable synthesis
When using this robotic platform, the key to controllable synthesis is to establish correlations between SDAs for automated synthesis and the corresponding crystal morphologies on the nanoscale. To achieve this goal, in situ robotic synthesis and characterization, ML prediction and ex situ transmission electron microscopy (TEM) or scanning electron microscopy (SEM) characterization (for morphology validation) were combined (Fig. 4). The correlations between the SDAs and the morphologies were determined by constructing ML models based on a dataset generated from rapid in situ characterization and validated by a small dataset generated from ex situ characterization. Based on the initial results from data-driven robotic synthesis, controllable synthesis of the two typical examples (gold NCs and double-perovskite NCs) was further explored.
On the basis of single-factor experiments, the ML models identified correlations between the robotically in situ characterized LSPR and SDA content for gold NCs, which are presented in Supplementary Fig. 7 (the corresponding coefficients are listed in Supplementary Tables 7 and 8). CTAB, AgNO3 and HCl showed greater effects on the LSPR than the gold seeds, AA and HAuCl4. Therefore, CTAB, AgNO3 and HCl were identified as SDAs, which could be used as key synthesis parameters to control the morphology during robotic synthesis. For example, the LSPR peak shift in Supplementary Fig. 4 could be adjusted by the factor CTAB using this platform. As a result, the relationships between the SDAs and the characterized LSPR were verified to be the key to achieving controllable synthesis.
To further explore the manipulation mechanism, double-factor (that is, investigation of two identified SDAs) experiments were carried out (Supplementary Table 9). The LSPR was further extracted and then used to train the double-factor ML models. CTAB and AgNO3 exhibited synergistic effects on the tunability of the LSPR (Fig. 4a), which is consistent with the observation that CTAB and Ag+ form a face-specific capping agent for tuning the morphology 36,37 .
Furthermore, the morphological aspect ratio (AR) (Fig. 4b) was measured and calculated based on TEM results (Fig. 4c and Supplementary Fig. 8). A linear relationship between the LSPR and morphological AR was observed, as shown in Fig. 4b. These results further confirm that the SDA-based parameter can be used as an input, while the AR can be used as the output, providing data useful for controllable synthesis of NCs. The double-factor ML models of AR versus SDA contents are presented in Supplementary Fig. 9 (the corresponding coefficients are listed in Supplementary Tables 10 and 11). Interestingly, CTAB and HCl in the double-factor experiment (Supplementary Fig. 9f) exhibited similar behaviour in morphology manipulation (in a collaborative manner) to that of CTAB and AgNO3 in their double-factor experiment (Supplementary Fig. 9c). In contrast, for the double-factor experiment investigating AgNO3 and HCl (parameters in Supplementary Table 11), AgNO3 dominated the morphology manipulation (Supplementary Fig. 9i). The design of triple-factor experiments (parameters in Supplementary Table 12) and the developed ML models are shown in Supplementary Tables 13 and 14. By varying three SDAs in the triple-factor experiments, a more complex AR response profile was observed, as shown in Supplementary Fig. 10. Therefore, the correlation between the SDAs and morphological AR in the controllable synthesis of gold NCs was confirmed.

Fig. 4 | Controllable synthesis involving robotic in situ characterization, ML models and ex situ morphology validation. a, Double-factor ML model of robotically in situ characterized LSPR versus SDA content for gold NCs (volume (V) and concentration (C)). b, Identified linear relationship between the LSPR and AR (the results for c1-c4 are measured from c). c, TEM images for AR validation (more TEM images are presented in Supplementary Fig. 8). d, Double-factor ML model of robotically in situ characterized normalized absorption at 400 nm versus SDA content for double-perovskite NCs. e, Identified relationship between the normalized absorption at 400 nm and morphological size (the results for f1-f4 are measured from f). f, SEM images for size validation (more SEM images are presented in Supplementary Fig. 16).

Article https://doi.org/10.1038/s44160-023-00250-5
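The linear relationship between the LSPR and the AR reported for gold NCs (Fig. 4b) lends itself to a short worked example: fit a straight line to paired (AR, LSPR) measurements, then invert it to estimate the AR from an in situ LSPR reading. The data points below are synthetic placeholders rather than values from the study, and the helper names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic calibration points: AR from TEM, LSPR peak (nm) from absorption spectra.
ar_values = [2.0, 3.0, 4.0, 5.0]
lspr_nm = [612.0, 705.0, 803.0, 894.0]

slope, intercept = fit_line(ar_values, lspr_nm)
r2 = r_squared(ar_values, lspr_nm, slope, intercept)

# Invert the calibration: estimate the AR of a sample from its LSPR peak alone.
estimated_ar = (800.0 - intercept) / slope
print(f"slope={slope:.1f} nm per AR unit, R^2={r2:.4f}, AR(800 nm)={estimated_ar:.2f}")
```

A calibration of this kind is what allows a small ex situ dataset (TEM) to validate a much larger in situ dataset (absorption spectra), since the optical readout can then serve as a proxy for the morphology.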
For controllable synthesis of double-perovskite crystals, single-factor and double-factor experiments were performed to tune the crystal size from microcrystals to NCs. PVP as a surfactant shows great promise in reducing the size of the double-perovskite crystals, as suggested by the results of the surfactant screening experiments (Fig. 3f). In addition, considering that the main ions might also contribute to the formation of crystals, single-factor experiments (parameters in Supplementary Table 15) were conducted to verify the effects of the CsCl, InCl3 and BiCl3 additives together with PVP as the surfactant based on the crystal structure (Supplementary Fig. 11) and photoluminescence (Supplementary Fig. 12) results. In the presence of PVP (0-1,000 μl) and BiCl3 (0-100 μl), the pure phase of Cs2AgIn1−xBixCl6 (0 ≤ x ≤ 1) (Supplementary Fig. 11g) was obtained, as confirmed by X-ray diffraction patterns (Supplementary Fig. 11d-f). Moreover, the supernatants of the samples with varied PVP and BiCl3 contents exhibited photoluminescence changes (Supplementary Fig. 12). These experiments reveal the vital roles of PVP and BiCl3 as SDAs in the growth of Cs2AgIn1−xBixCl6 double perovskites.
Therefore, double-factor experiments (PVP and BiCl3 contents) were designed to further investigate the factor effects in tuning the crystal size, as shown in Supplementary Table 16. Ultraviolet-visible-near-infrared absorption spectra and colour analysis of the samples show similar trends, which can be elucidated by Mie scattering theory 38 .
In situ ultraviolet-visible-near-infrared absorption spectroscopy characterization was conducted (Supplementary Fig. 13). The spectra were normalized according to the absorption peak, and the absorption intensity at 400 nm, as shown in Supplementary Fig. 14, was extracted as an indicator of the crystal size 39 . The double-factor ML model of the characterized absorption versus SDA content is presented in Fig. 4d, showing that the intensity at 400 nm decreased with increasing PVP and BiCl3 contents (validated in Supplementary Fig. 15). The correlation between the normalized absorption at 400 nm and the crystal size was further evaluated, as shown in Fig. 4e. The size was validated by SEM characterization (down to 64 nm) (Fig. 4f and Supplementary Fig. 16). These results indicate that the size of the Cs2AgIn1−xBixCl6 double-perovskite crystal can be manipulated by changing both the PVP and BiCl3 contents (Supplementary Fig. 17), which is consistent with the SEM validation and theoretical calculations (Supplementary Fig. 18). Hence, the correlations between the SDAs (PVP and BiCl3 contents) as inputs and crystal sizes as the output for controllable synthesis of double-perovskite NCs were identified.
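The extraction of the size indicator can be sketched as follows: normalize a spectrum by its absorption peak, then read off the normalized value at 400 nm by linear interpolation between the two nearest sampled wavelengths. The spectrum values are synthetic, the function name is hypothetical, and the use of linear interpolation is an assumption, since the text does not state how the value at 400 nm was read.

```python
def normalized_absorption_at(wavelengths_nm, absorbance, target_nm=400.0):
    """Normalize a spectrum by its peak and interpolate at target_nm.

    Assumes wavelengths_nm is sorted in ascending order and has the same
    length as absorbance.
    """
    peak = max(absorbance)
    if peak <= 0:
        raise ValueError("spectrum has no positive absorption peak")
    normalized = [a / peak for a in absorbance]
    for i in range(len(wavelengths_nm) - 1):
        w0, w1 = wavelengths_nm[i], wavelengths_nm[i + 1]
        if w0 <= target_nm <= w1:
            if w1 == w0:
                return normalized[i]
            t = (target_nm - w0) / (w1 - w0)
            return normalized[i] + t * (normalized[i + 1] - normalized[i])
    raise ValueError("target wavelength lies outside the measured range")


# Synthetic spectrum (illustrative values only).
wavelengths = [350.0, 375.0, 390.0, 410.0, 450.0]
absorbance = [0.8, 1.0, 0.6, 0.4, 0.2]
indicator = normalized_absorption_at(wavelengths, absorbance)
print(f"normalized absorption at 400 nm: {indicator:.2f}")
```

Normalizing before reading the value removes the dependence on overall sample concentration, so that the indicator reflects the spectral shape rather than the amount of material in the well.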

Morphology-oriented inverse design
Utilizing the data acquired from controllable synthesis, the robotic platform was further developed with the aim of inversely designing targeted NC morphologies based on the correlations between SDAs and the morphologies identified by ML models. This platform continues to be improved by receiving more robotically characterized data to realize the ultimate goal of morphology-oriented inverse design of NCs. With the aid of the platform, over 2,300 gold NC samples and 1,000 double-perovskite NC samples were synthesized and in situ characterized; graphical diagrams of the databases are shown in Fig. 5a,b, respectively. The experimental database is considered to be crucial in supporting the inverse design process. At the same time, the results obtained from in situ colour characterization using the colour-ultrasensitive mobile camera contributed to the generation of another potential dataset for rapid inverse design. Colour information (RGB values) was extracted from photographs of gold NCs (Supplementary Fig. 19) and double-perovskite NCs (Supplementary Fig. 20). The corresponding RGB values are presented in Supplementary Tables 17 and 18, and the ML models of gold NCs and double-perovskite NCs were trained based on the normalized RGB values, as presented in Fig. 5c,d, respectively. ML models with R² values of 0.94 and 0.90 were obtained for gold and double-perovskite NCs (the formulas and coefficients of the models are provided in Supplementary Tables 19 and 20), respectively. The results indicate that colour analysis, which is typically achieved by visual inspection by a professional scientist, can serve as another indirect indicator for rapid morphological identification. In this way, the platform can be used for digitalization of colour in a similar way as a scientist but without bias, thus contributing to inverse design of NCs with colour features.
Based on the experimental database and ML models, the morphologies (AR or size) as 'input' and identified SDAs as 'output' are presented for gold NCs and double-perovskite NCs in Fig. 5e,f, respectively. In Fig. 5e,f, the effective ranges of the typical morphological AR (for gold NCs, from 1.90 to 6.00) and size (for double-perovskite NCs, down to 64 nm) are graphically displayed. Morphological control with broad tunability within these ranges is revealed for both gold NCs and double-perovskite NCs. Finally, the achieved inverse design of a targeted NC morphology with the corresponding SDAs is also illustrated in Fig. 5e,f. The effective inverse design of gold NCs (Supplementary Fig. 21) and double-perovskite NCs (Supplementary Fig. 22) shows the promise of robotic synthesis of targeted NCs. In particular, we demonstrate the successful synthesis of targeted gold NCs (AR = 4.06 ± 0.41) and of nanosized (78 nm) and microsized (749 nm) double perovskites using the inverse design strategy, as given in Supplementary Tables 21 and 22. Therefore, this study reveals that inverse design can be achieved by making use of databases (SDA-based parameters from robotic synthesis, robotic in situ characterization results and ex situ validation results) and the corresponding ML models.
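The direction of inverse design (morphologies → SDAs) can be illustrated with a minimal sketch: given a forward model that predicts the AR from two SDA amounts, search a grid of candidate amounts for the combination whose prediction is closest to the target. The forward model below is a made-up placeholder with invented coefficients, not one of the trained models, and the grid search is only one possible way to invert a model; the text does not specify how the inversion was carried out.

```python
from itertools import product


def forward_ar(ctab_ul, agno3_ul):
    """Hypothetical forward model: AR as a function of two SDA volumes.

    The coefficients are invented for illustration and include an
    interaction term to mimic a synergistic effect.
    """
    return 1.5 + 0.01 * ctab_ul + 0.05 * agno3_ul + 0.0002 * ctab_ul * agno3_ul


def inverse_design(target, ctab_grid, agno3_grid, model=forward_ar):
    """Return the grid point whose predicted AR is closest to the target."""
    best = min(
        product(ctab_grid, agno3_grid),
        key=lambda sda: abs(model(*sda) - target),
    )
    return best, model(*best)


# Candidate SDA volumes in microlitres (assumed ranges and step sizes).
ctab_grid = range(50, 201, 10)
agno3_grid = range(5, 31, 5)

sda, predicted = inverse_design(4.06, ctab_grid, agno3_grid)
print(f"suggested CTAB={sda[0]} ul, AgNO3={sda[1]} ul, predicted AR={predicted:.2f}")
```

Restricting the search to a grid of dispensable volumes keeps every suggestion directly executable by the pipettes; a suggestion would still need to be synthesized and validated, as is done for the targeted samples reported in the text.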

Discussion
To address the limitations of automated synthesis and characterization of colloidal NCs with morphological control on the nanoscale, a robotic platform framework is established by integrating data mining, controllable synthesis and inverse design of targeted NC morphologies.
For automated synthesis, data mining of the literature was conducted to provide initial choices of key synthesis parameters, for example, the concentrations of the known surfactants for gold NCs and the types of unknown surfactants for lead-free double-perovskite NCs. High-throughput automated synthesis, such as single-factor and double-factor experiments, was then systematically conducted. In these processes, accessible large (in situ characterized ultraviolet-visible-near-infrared absorption spectra and RGB results) and small (ex situ TEM and SEM validations) datasets were generated to continuously expand the experimental databases. Through a sequence of iterations, the corresponding experimental database was constructed for training the ML models, which enabled controllable synthesis of morphology-tunable NCs. These developed ML models could be used to identify the complex correlations between the SDAs and crystal morphologies in the controllable synthesis of gold NCs and double-perovskite NCs. The experimental databases and ML models are critical for supporting the inverse design process. Furthermore, inverse design of targeted morphologies with the ML-predicted SDA-based synthesis parameters (morphologies → SDAs) was successfully demonstrated for both gold NCs (with strong visible-light absorption) and double-perovskite NCs (with intensive photoluminescence).
Comprising data-driven robotic synthesis, robot-assisted controllable synthesis and morphology-oriented inverse design, this new synthetic framework was developed for synthesis of inorganic NCs. Training individuals to be highly qualified scientists with the expertise for conducting crystal synthesis and characterization on the nanoscale entails considerable cost. The outcomes of NC morphology manipulation can be diverse, as they depend heavily on the scientists' experiences. Moreover, most material synthesis and discovery involve trial-and-error synthesis and labour-intensive characterization 19,40 . The prototype of this robotic platform specifically demonstrated for synthesis and characterization of NCs is a start toward reducing the manual tasks. In the current work, initial selection of key synthesis parameters from a literature search, high-throughput synthesis and in situ characterization, synthesis parameter-crystal morphology correlation identification, and inverse design of morphology-tunable NCs were achieved, which are comparable to those performed by an experienced scientist in these fields. This synthetic approach is believed to be promising for digital synthesis of NCs from data to a crystal with the desired morphology using the robotic platform.

Data mining
The initial choices of key synthesis parameters were obtained by data mining the existing literature through our automated literature recommendation system to plan the automated synthesis. The whole process consists of several steps, as shown in Supplementary Fig. 1: (1) downloading literature from publishers by keywords; (2) using rules to locate target paragraphs; (3) splitting words and sentences in paragraphs by ChemicalTagger 41 ; (4) applying a four-step cascading tagger (chemical named entity recognition by OSCAR4 35 , additional recognition of chemical entities based on a domain dictionary, identification of the units of compound properties based on regular expressions and tagging of English parts of speech (POS) by OpenNLP (https://opennlp.apache.org)); (5) applying phrase parsing 41 ; and (6) statistically analysing the distribution of the extracted key synthesis parameters. In this work, the workflow of the automated literature recommendation system was demonstrated with two typical topics: gold NCs and double-perovskite NCs (details are provided in the Supplementary Methods).
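One of the tagging steps listed above, the identification of units of compound properties with regular expressions, can be illustrated in isolation. The pattern, the unit list and the example sentence are assumptions written for this sketch; they are not the rules used by the literature recommendation system, and the other steps (ChemicalTagger, OSCAR4, OpenNLP) are not reproduced here.

```python
import re

# Assumed unit list for illustration; longer alternatives come first so that
# "mM" is not read as "M".
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*(mM|μM|M|ml|μl|°C|min|h)\b")


def extract_quantities(text):
    """Return (value, unit) pairs for every number followed by a known unit."""
    return [(float(value), unit) for value, unit in QUANTITY.findall(text)]


# Invented sentence in the style of an experimental section.
sentence = (
    "The growth solution contained 0.1 M CTAB, 0.5 mM HAuCl4 and 80 μl of "
    "10 mM AgNO3, and was kept at 28 °C for 12 h."
)
quantities = extract_quantities(sentence)
for value, unit in quantities:
    print(value, unit)
```

The digits inside chemical formulas such as HAuCl4 are not picked up because no known unit follows them; linking each quantity to the compound it describes is the job of the entity recognition and phrase parsing steps, which this sketch leaves out.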

Robotic platform
The robotic platform was designed and assembled in our laboratory with a series of modules capable of performing robot-assisted high-throughput synthesis and in situ characterizations. Automated pipettes, sample and consumables storage, a synthesis platform, light sources, colour-ultrasensitive mobile cameras, and a microplate reader are integrated with a mobile robot and a robotic arm, as shown in Fig. 2. Based on the synthesis parameters obtained from data mining, the execution of automated experiments is programmed in Robotic Execution Excel files and then checked and monitored by a simulated operation system to conduct synthesis and characterization of NCs. Two typical operation videos for gold NCs (under white light for colour capture) and double-perovskite NCs (under ultraviolet light for fluorescence capture) are provided in Supplementary Videos 1 and 2.

were pipetted into a well of a 96-well microplate by automatic pipettes. A 4 min mixing process was set for each addition of chemicals (except for the gold seeds). After that, 2.4 μl of preprepared seed solution was injected into the growth solution and then gently stirred for 10 s. The microplate was transferred to a furnace by the robotic arm and kept undisturbed at 28 °C for 12 h. After the growth of gold NCs for 12 h, the microplate was taken out of the furnace by the robotic arm. Next, 200 μl of the mixed solution was aspirated into a well of a 96-well transparent microplate for ultraviolet-visible-near-infrared absorption measurement and colour characterization. Details of the synthesis parameters and experimental design for the gold NCs are given in the Supplementary Methods. A video showing the automated synthesis of the gold NCs is also provided in Supplementary Video 1.

ML models
ML models were employed for identification of the correlation between the synthesis parameters and the corresponding morphologies from an experimental database, which were characterized by an ultraviolet-visible-near-infrared absorption spectrometer and a colour-ultrasensitive mobile camera installed on the robotic platform. The sure independence screening and sparsifying operator (SISSO) approach 42,43 , a supervised ML algorithm that is a compressed sensing-based approach, was used to determine the critical correlation. For the construction of the experimental feature spaces during ML training, the set of implemented operators was:

opset ≡ {(+)(−)(*)(/)(^-1)(^2)(^3)(exp)(log)(sqrt)(cbrt)(sin)(cos)}

The detailed ML models with the corresponding coefficients are given in the Supplementary Information (Supplementary Tables 7, 8, 10, 11, 13, 14, 19 and 20) for gold NCs and double-perovskite NCs.
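The idea of building a feature space from an operator set and then screening it can be shown with a deliberately simplified sketch. It applies a subset of the operators above to two primary features, then ranks the resulting candidates by their absolute Pearson correlation with the target, which corresponds only to the screening half of the approach. It is not the SISSO code: the sparsifying-operator step and deeper feature combinations are omitted, and the feature names, data and helper functions are assumptions made for this example.

```python
import math
from itertools import combinations

# Subset of the operator set, chosen for brevity.
UNARY = {
    "^2": lambda v: v ** 2,
    "^-1": lambda v: 1.0 / v,
    "sqrt": math.sqrt,
    "log": math.log,
}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def build_features(primary):
    """Expand primary features with one round of unary and binary operators."""
    features = dict(primary)
    for name, column in primary.items():
        for op, fn in UNARY.items():
            features[f"({name}){op}"] = [fn(v) for v in column]
    for (n1, c1), (n2, c2) in combinations(primary.items(), 2):
        for op, fn in BINARY.items():
            features[f"({n1}{op}{n2})"] = [fn(a, b) for a, b in zip(c1, c2)]
    return features


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen(features, target, k=3):
    """Keep the k candidate features most correlated with the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Synthetic data: the target is constructed as the product of the two inputs.
primary = {"ctab": [1.0, 2.0, 3.0, 4.0, 5.0], "agno3": [2.0, 1.0, 4.0, 3.0, 5.0]}
target = [a * b for a, b in zip(primary["ctab"], primary["agno3"])]

features = build_features(primary)
print(screen(features, target, k=3))
```

Because the synthetic target is the product of the two inputs, the product feature ranks first, which illustrates how an interaction between two SDAs can surface as an explicit term in the candidate space.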

Data availability
The data that support the findings of this study are available in the Supplementary Information.

Code availability
The computer code, algorithm and related data to generate the results that are reported in this paper and are central to its main claims are available in the Zenodo repository with the digital object identifier: https://doi.org/10.5281/zenodo.7353405. The algorithms for chemical name entity recognition 35 , expressions and grammatical structures 41 , and SISSO 42,43 are adaptable as described in detail in refs. 35,41–43.