Introduction

Nucleation is the first step of forming a stable new phase from a supersaturated medium. In this manuscript we focus on the nucleation of a solid phase from a supersaturated liquid solution. During the nucleation process, the smallest possible units of the solid phase, called nuclei, form in a supersaturated solution. These nuclei then evolve by growth, ripening, aggregation and agglomeration, phase transformation, and crystallization. The nucleation process, especially when it occurs at interfaces (i.e., heterogeneous nucleation), is both profoundly important and widely encountered in material synthesis1,2,3,4,5,6, battery operation7,8,9, cement hardening10,11, geochemistry2,12,13,14, geologic CO2 sequestration15, biomineralization16,17, industrial scaling control18,19, and drug production20. Understanding nucleation and quantifying the related thermodynamic and kinetic parameters are prerequisites for comprehensively describing, predicting, controlling, and fine tuning these systems.

In current numerical models that include solid phase formation, nucleation is usually approximated by precipitation on seeds of the secondary phase because the parameters required for numerically simulating the nucleation process are lacking. However, the seeded model can miss important characteristics of nucleation, such as the high specific reactive surface area of nuclei and the rate-limiting role of nucleation14,21. To incorporate nucleation into simulations, we need a deeper understanding of its kinetic and thermodynamic parameters21,22,23,24.

An effective way of describing nucleation is presented by the nucleation rate equation:

$$J = J_0\exp \left( { - \frac{{{\mathrm{\Delta }}G^ \ast }}{{RT}}} \right) = A{\mathrm{exp}}\left( { - \frac{{E_{\mathrm{a}}}}{{RT}}} \right)\exp \left( { - \frac{{{\mathrm{\Delta }}G^ \ast }}{{RT}}} \right),$$
(1)

in which J is the nucleation rate in units of number, volume, or monomer consumption cm−3 s−1 for homogeneous nucleation, or number, volume, or monomer consumption cm−2 s−1 for heterogeneous nucleation. R is the ideal gas constant (8.3144598 J mol−1 K−1), T is temperature in Kelvin (K), and \({\mathrm{\Delta }}G^ \ast\) is the thermodynamic energy barrier (J mol−1), which is related to interfacial energies. J0 in Eq. (1) is attributed to the kinetics of the system, and can be expanded into \(A\,{\mathrm{exp}}\left( { - \frac{{E_{\mathrm{a}}}}{{RT}}} \right)\) (refs. 25,26,27,28), where A is the pre-exponential kinetic factor related to ion diffusion and nuclei surface properties, and Ea is the apparent activation energy (J mol−1), that is, the kinetic energy barrier. The mathematical derivation of Eq. (1) is based on an imagined pathway where nuclei are formed by addition of one monomer at a time, until the nucleus is large enough to be stable as a new phase29. Despite the discovery of more realistic nucleation pathways in the past decade17,30,31,32, Eq. (1) has repeatedly been found to capture nucleation kinetics25,31,33,34,35,36,37, and therefore can serve as an effective description in numerical models for evaluation and prediction of nucleation.
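Because both barriers in Eq. (1) enter through the same Arrhenius-type exponential, an equal change in Ea or in ΔG* changes J by the same factor. The short sketch below illustrates this; the function name and the numerical inputs are placeholders chosen for illustration, not values fitted in this study.

```python
import math

R = 8.3144598  # ideal gas constant, J mol^-1 K^-1 (value quoted in the text)


def nucleation_rate(A, Ea, dG_star, T):
    """Eq. (1): J = A * exp(-Ea/RT) * exp(-dG*/RT). J carries the units of A."""
    J0 = A * math.exp(-Ea / (R * T))          # kinetic factor J0
    return J0 * math.exp(-dG_star / (R * T))  # times the thermodynamic factor


T = 298.15  # K (25 degrees C)

# Barrier change that corresponds to one order of magnitude in J at this T.
decade = R * T * math.log(10)  # J mol^-1

# Placeholder inputs (hypothetical, for illustration only).
base = nucleation_rate(A=1.0, Ea=40e3, dG_star=20e3, T=T)
lower_Ea = nucleation_rate(A=1.0, Ea=40e3 - decade, dG_star=20e3, T=T)
lower_dG = nucleation_rate(A=1.0, Ea=40e3, dG_star=20e3 - decade, T=T)

print(f"RT ln(10) at 25 C   = {decade / 1e3:.2f} kJ/mol")
print(f"J ratio, lower Ea   = {lower_Ea / base:.3f}")
print(f"J ratio, lower dG*  = {lower_dG / base:.3f}")
```

At 25 °C a shift of roughly 5.7 kJ mol−1 in either barrier corresponds to a 10-fold change in J, which gives a sense of how sensitive the rate is to uncertainty in Ea.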

Until now, a comprehensive understanding of nucleation has been hindered by limited knowledge of both the thermodynamic and kinetic factors in Eq. (1), except for commonly used solubility products (Ksp) and mass densities. With regard to thermodynamic parameters, the interfacial energies have been reported for several common materials29,38. For example, Fernandez-Martinez et al.34 and Li et al.36 reported that the interfacial energies for nucleation of CaCO3 on silicates were as low as 35–50 mJ m−2. The overall system’s interfacial energy can be affected appreciably by individual interfacial energies among nuclei, substrates, and solutions13,33,35,36,39,40,41.

The kinetic parameters in Eq. (1), however, are surprisingly less well known. Because information on J0 is lacking, most previous studies have assumed a constant J0 term25,33,34. To the authors’ knowledge, investigations of J0 are limited to one theoretical estimation and one experimental study at room temperature: Nielsen (1964)28 estimated J0 for homogeneous nucleation to be \(D/d^{5}\), where D is the diffusion coefficient of the monomers and d is the monomer diameter, and Wallace et al. reported J0 obtained from in situ atomic force microscopy (AFM) for silica nucleation on carboxyl and NH3+/COO− hybrid substrates to be \(10^{13.5 \pm 0.7}\) and \(10^{14.8 \pm 1.4}\) nuclei m−2 min−1, respectively. However, the range of J0 for other systems is largely unknown, and there is no experimental quantification that allows reliable estimation at different temperatures.

The lack of effort to quantify J0 is partially attributed to the assumption that J0 is less influential for nucleation rates than interfacial energies are28, although two studies have already provided evidence that J0 alone can account for more than a 10-fold difference in the nucleation rate under certain conditions1,42. The importance of J0 makes it necessary to quantify Ea and A, both to calculate J0 and predict the nucleation process reliably, and to fine-tune nucleation systems efficiently by modifying either interfacial energies or kinetic barriers.

Here we describe an experimental study that uses in situ grazing incidence small-angle X-ray scattering (GISAXS) and ex situ AFM to determine Ea and A for more accurate estimation of J0 in Eq. (1) under different conditions. Heterogeneous CaCO3 nucleation at a water–quartz interface (12–31 °C, IAP/Ksp(calcite) = \(10^{1.65}\)) is employed as a model system, because this system involves two of the most common materials in natural and engineered systems, and because interfacial energies have been reported for exactly the same experimental setup36, so that systematic errors are minimized. The calculations result in an apparent activation energy Ea of 45 ± 7 kJ mol−1 and a pre-exponential kinetic factor A of \(10^{12.0 \pm 1.1}\) nuclei × (μm−2 of quartz substrate surface area) × min−1, or \(10^{2.9 \pm 1.3}\) (moles of Ca2+ or CO32− consumed from the solution) × (m−2 of quartz substrate surface area) × min−1. The values for Ea and A can be directly applied to numerical models to simulate and optimize the targeted system, and the GISAXS-AFM methods developed in this study can be adapted to general systems that utilize GISAXS at interfaces.

Results

Apparent activation energy

Background-subtracted X-ray scattering intensities from heterogeneously formed CaCO3 nuclei are shown in Fig. 1. From numerical fitting of the GISAXS intensity as stated in the section “Methods”, we observed that under all the conditions, the radii of the nuclei were 4.7 ± 0.7 nm, without significant difference. Nucleus growth was not appreciable, and therefore the system was nucleation-dominant.

Fig. 1

Plots of background-subtracted scattering intensities over the scattering vector q at selected time points. Panels a–c are for the 12, 25, and 31 °C systems, respectively. The data points at q higher than ~0.1 Å−1 have low signal-to-noise ratios and are not considered in data analyses. Radii of gyration Rg were obtained from GISAXS intensity fitting, and did not evolve appreciably with time under any condition. In all systems, the GISAXS intensity fitting shows similar nucleus sizes. The higher the temperature, the faster the nucleation and the shorter the induction time.

The nucleus numbers in relative units (r.u.) obtained from GISAXS are plotted versus reaction times in Fig. 2. Nucleus numbers (r.u.) were extracted by two methods (as described in Methods): the invariant method, which assumes that nucleation is dominant over particle growth, plotted on the left axis; and the GISAXS intensity fitting method, which deconvolutes nucleation from particle growth but requires relatively high signal-to-noise ratios, plotted on the right axis. The results from the two methods are consistent with each other (i.e., they overlap for the most part) after appropriate linear scaling of the y-axes. Therefore, for early time points where the signal-to-noise ratio is too low to be fitted with the GISAXS intensity model, the invariant values can be used to forecast the trend of data from GISAXS fitting. The slopes from the linear regressions of the GISAXS-obtained nucleus numbers (r.u.) over reaction time were taken as nucleation rates (r.u.), because, unlike the invariant method, which is influenced by particle growth (see Supplementary Figure 1), the fitting method can separate nucleation from nucleus growth (although growth was not noticeable in our systems) and is considered more accurate. The intersections of the regressed lines with the x-axis were taken as the induction times. The logarithms of these nucleation rates were regressed over 1/T, as shown in Fig. 3, according to a rearrangement of Eq. (1):

$$\ln \left( J \right) = {\mathrm{ln}}(A) - \frac{{{\mathrm{\Delta }}G^ \ast + E_{\mathrm{a}}}}{{RT}}.$$
(2)
Fig. 2

Representative plots of invariant values and fitted nuclei numbers with respect to reaction times. Invariant values are plotted in open circles, and the fitted nucleus numbers are plotted in closed circles. The relative unit on the left axis is not the same as that on the right axis. The dotted lines are from linear regressions of the fitted nucleus numbers over reaction times. The slopes of these regressed lines are taken as the nucleation rates, J.
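The way a nucleation rate and an induction time are read from a plot like Fig. 2 can be sketched as an ordinary least-squares line fit. The data points below are hypothetical placeholders, not measurements from this study, and `linear_fit` is a helper written for this illustration.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical fitted nucleus numbers (r.u.) at several reaction times (min).
time_min = [40.0, 50.0, 60.0, 70.0, 80.0]
nuclei_ru = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept = linear_fit(time_min, nuclei_ru)

nucleation_rate_ru = slope              # r.u. per minute
induction_time_min = -intercept / slope  # x-axis intercept of the fitted line

print(f"nucleation rate = {nucleation_rate_ru:.3f} r.u. min^-1")
print(f"induction time  = {induction_time_min:.1f} min")
```

The slope is the rate in relative units, and the time at which the fitted line crosses zero nuclei is the induction time.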

Fig. 3

Plot of ln(J) versus 1/T. The linear regression yields an R2 of 0.9741, and the resulting slope is used to calculate the sum (ΔG* + Ea).
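The regression in Fig. 3 follows directly from Eq. (2): the slope of ln(J) versus 1/T equals −(ΔG* + Ea)/R and the intercept equals ln(A). The sketch below only demonstrates the mechanics. Since the measured rates are not reproduced in the text, it generates synthetic ln(J) values from an assumed barrier and intercept and then recovers them, so it is a consistency check and not a re-analysis of the data.

```python
R = 8.3144598  # J mol^-1 K^-1


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Temperatures used in the experiments (degrees C), converted to 1/T in K^-1.
temps_C = [12.0, 25.0, 31.0]
inv_T = [1.0 / (t + 273.15) for t in temps_C]

# Synthetic ln(J) values from Eq. (2); both inputs are assumptions here.
assumed_barrier = 61.5e3  # J mol^-1, the sum (dG* + Ea)
assumed_lnA = 15.5        # ln(A) in relative units
ln_J = [assumed_lnA - assumed_barrier * x / R for x in inv_T]

slope, intercept = linear_fit(inv_T, ln_J)
barrier = -slope * R  # recovered (dG* + Ea), J mol^-1

print(f"(dG* + Ea) = {barrier / 1e3:.1f} kJ/mol, ln(A) = {intercept:.2f}")
```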

The resulting ΔG*+Ea was 61.5 ± 5.8 kJ mol−1. To obtain Ea, the value of ΔG* was calculated according to

$${\mathrm{\Delta }}G^ \ast = \frac{{16\pi \upsilon ^2\alpha ^3}}{{3R^2T^2\left[ {\ln \left( {\frac{{{\mathrm {IAP}}}}{{K_{{\mathrm{sp}}}}}} \right)} \right]^2}}.$$
(3)

In Eq. (3), υ is the molar volume of nuclei (cm3 mol−1) and can be estimated using the density and molecular weight of the nucleating material. The constant 16π/3 is a geometry factor from the mathematical derivation for homogeneous nucleation of spherical particles29, with α representing the interfacial energy between nuclei and fluid. However, note that for heterogeneous nucleation, 16π/3 is no longer a geometry factor but becomes a numerical constant. It is used in the heterogeneous nucleation case not to suggest spherical nuclei, but to facilitate comparison between the interfacial energy in homogeneous nucleation and the effective interfacial energy in heterogeneous nucleation. The complex relationship among the nuclei geometry, contact angle of nuclei on substrates, and the interfacial energies among nuclei, substrate, and liquid is accounted for in α, the effective interfacial energy of the system. In our previous studies, the α at room temperature for CaCO3 nucleation on quartz was experimentally found to be 47.1 ± 1.3 mJ m−2(refs. 1,36). IAP is the ion activity product (Ca2+)(CO32−), and Ksp is the solubility product of the nucleating CaCO3 phase. In this study, the reaction solutions were supersaturated with calcite and vaterite, and were undersaturated with amorphous CaCO3. To avoid unnecessary complication with CaCO3 phase transformation and to serve as a direct reference for numerical models where calcite is included as the CaCO3 phase, our calculation used υ and Ksp for calcite. Calculations for vaterite can be found in Supplementary Note 1 and Supplementary Table 1. With these parameters, the thermodynamic energy barrier, ΔG*, was calculated as 16 ± 3 kJ mol−1 in the temperature range of 12–31 °C (15.5 ± 1.3 kJ mol−1 for 12 °C, 16.2 ± 1.3 kJ mol−1 for 25 °C, and 18.6 ± 1.5 kJ mol−1 for 31 °C). Subtraction of ΔG* from the sum of (ΔG* + Ea) gave the kinetic energy barrier, Ea, equal to 45 ± 7 kJ mol−1.
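As a numerical check of Eq. (3), the sketch below evaluates ΔG* at 25 °C. Two points are assumptions made for this illustration and are not stated in the text: the calcite density (2.71 g cm−3) and molar mass (100.09 g mol−1) are standard handbook values, and the result of Eq. (3) as written is multiplied by Avogadro's number so that the barrier is expressed per mole. With these choices the calculation gives about 16.2 kJ mol−1, in line with the 25 °C value reported above.

```python
import math

R = 8.3144598        # J mol^-1 K^-1
N_A = 6.02214076e23  # mol^-1; assumption: converts per-nucleus energy to per-mole


def delta_G_star(alpha, molar_volume, T, log10_S):
    """Eq. (3) with SI inputs: alpha in J m^-2, molar_volume in m^3 mol^-1.

    Returns the thermodynamic barrier in J mol^-1.
    """
    ln_S = log10_S * math.log(10)  # ln(IAP/Ksp)
    per_nucleus = (16 * math.pi * molar_volume**2 * alpha**3
                   / (3 * R**2 * T**2 * ln_S**2))
    return per_nucleus * N_A


# Calcite molar volume from assumed handbook density and molar mass.
molar_volume_calcite = 100.09 / 2.71 * 1e-6  # m^3 mol^-1

dG = delta_G_star(alpha=0.0471, molar_volume=molar_volume_calcite,
                  T=298.15, log10_S=1.65)
Ea = 61.5e3 - dG  # subtract from the regressed sum (dG* + Ea)

print(f"dG*(25 C) = {dG / 1e3:.1f} kJ/mol")
print(f"Ea        = {Ea / 1e3:.1f} kJ/mol")
```

The sketch uses a single α for all temperatures, so it is only compared against the 25 °C value.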

Pre-exponential kinetic factor

For nucleating materials with high scattering length density and in vacuum or an air background, the high signal-to-noise ratio of GISAXS signals can be numerically fitted with geometrical modeling43. On the other hand, for nucleation in liquid systems, especially those where water acts as a reaction medium, the GISAXS signals after background subtraction are typically low and often do not support geometrical modeling. In these cases, the absolute nucleus numbers can be achieved by straightforward calibration of GISAXS data with AFM images. Representative AFM images of heterogeneously formed nuclei are shown in Fig. 4. Large particles were homogeneously formed and settled on the substrate, and were not the focus of this study. Small particles were heterogeneously formed and evenly distributed over the whole substrate (also see Supplementary Figure 2). These small nuclei were manually counted within a unit area of one square micrometer. The GISAXS-obtained particle numbers (r.u.) under the same conditions and at the same reaction time can be read from Fig. 2 on the right y-axis. Fig. 5 plots nuclei numbers counted from AFM images versus GISAXS-obtained nuclei numbers (r.u.). Each data point in Fig. 5 was generated from at least three 1 μm2 areas within one piece of quartz substrate. The data points in Fig. 5 are scattered because the induction time for 12 °C and 25 °C samples had a typical 20–30 min uncertainty.

Fig. 4

Representative AFM images of nuclei formed at different temperatures. Each image is a 1 × 1 μm2 area. The evenly distributed particles formed heterogeneously, while the larger particles formed homogeneously and settled to the quartz substrate surface

Fig. 5

Calibration of GISAXS results with AFM imaging. The nucleus numbers counted from AFM images with units of nuclei μm−2 are plotted versus the nucleus numbers in relative units obtained from fitting the GISAXS data. Error bars for the data points are particle number deviations from three 1 × 1 μm2 scan areas on a single substrate. The linear fitting line is known to cross (0,0), where no nuclei were formed on the substrate and no GISAXS signal should be observed. The correction factor from x-axis values to y-axis values is (1.77 ± 0.35) × \(10^{5}\).

In Fig. 5, the regression of the AFM-counted particle numbers over the GISAXS-obtained particle numbers (r.u.) is constrained to pass through the point (0,0), because no GISAXS signal from nuclei can be observed when no nuclei have formed on the substrate. This regression provided the correction factor from relative nuclei numbers obtained from GISAXS intensity fitting to absolute numbers, specifically (1.77 ± 0.35) × \(10^{5}\). Despite the uncertainty introduced by the induction time, the narrow standard deviation of the correction factor indicates the validity of this calibration. The validity is also demonstrated by the fact that the regression, obtained from all the data points, captures well the data points with the least uncertainty in induction time (i.e., data at 31 °C). With this correction factor, the nucleation rates from GISAXS data were converted from relative units to absolute units of nuclei μm−2 min−1 (Table 1). The absolute value of the pre-exponential kinetic factor, A, can be calculated in a similar way, according to the regression result (ln(A) = 15.5 ± 2.4) shown in Fig. 3. The value of A is 9.4 × \(10^{11}\) nuclei μm−2 min−1, or 1.6 × \(10^{22}\) nuclei m−2 s−1, with one order of magnitude uncertainty.
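The calibration and the unit conversion of A can be sketched in a few lines. The first part fits a line through the origin to paired GISAXS and AFM values; those data points are hypothetical placeholders. The second part applies the reported correction factor and ln(A) to obtain A in absolute units. With the rounded inputs quoted in the text it gives about 9.5 × 10^11 nuclei μm−2 min−1, close to the reported 9.4 × 10^11; the small difference presumably reflects rounding of ln(A).

```python
import math


def slope_through_origin(x, y):
    """Least-squares slope of y = k * x, with the line forced through (0, 0)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical calibration points: GISAXS numbers (r.u.) vs AFM counts (um^-2).
gisaxs_ru = [1.0e-3, 2.0e-3, 3.0e-3]
afm_nuclei_per_um2 = [180.0, 350.0, 530.0]
k_example = slope_through_origin(gisaxs_ru, afm_nuclei_per_um2)

# Values reported in the text.
correction_factor = 1.77e5  # nuclei um^-2 per r.u.
ln_A_ru = 15.5              # intercept of the ln(J) vs 1/T regression

A_um2_min = math.exp(ln_A_ru) * correction_factor  # nuclei um^-2 min^-1
A_m2_s = A_um2_min * 1e12 / 60.0                   # nuclei m^-2 s^-1

print(f"example slope = {k_example:.3g} nuclei um^-2 per r.u.")
print(f"A = {A_um2_min:.2e} nuclei um^-2 min^-1 = {A_m2_s:.2e} nuclei m^-2 s^-1")
```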

Table 1 Nucleation rates and pre-exponential kinetic factors obtained in this study expressed in different units

To compare with the J0 values (\(10^{13.5 \pm 0.7}\) and \(10^{14.8 \pm 1.4}\) nuclei m−2 min−1 at room temperature) for heterogeneous silica nucleation reported by Wallace et al. (2009), we employed the values of A and Ea obtained in this study and calculated J0 = A exp(−Ea/RT) for heterogeneous CaCO3 nucleation on clean quartz to be \(10^{16.1 \pm 1.0}\) nuclei m−2 min−1 at 25 °C. The higher J0 for CaCO3 nucleation is consistent with the observation that CaCO3 nucleation is faster than silica nucleation. The comparison provides a sense of the variation of J0 among different nucleating materials.
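The J0 value quoted above can be reproduced from the reported A and Ea. The sketch below works in log10 space to keep the numbers readable; the variable names are chosen for this illustration.

```python
import math

R = 8.3144598  # J mol^-1 K^-1

log10_A_um2 = 12.0           # reported A, nuclei um^-2 min^-1
log10_A_m2 = log10_A_um2 + 12.0  # 1 m^2 = 1e12 um^2
Ea = 45e3                    # reported apparent activation energy, J mol^-1
T = 298.15                   # 25 degrees C in K

# log10(J0) = log10(A) - Ea / (R T ln 10)
log10_J0 = log10_A_m2 - Ea / (R * T * math.log(10))

print(f"J0 = 10^{log10_J0:.1f} nuclei m^-2 min^-1")
```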

Nucleation rates expressed as concentration changes

Because reaction rates in aqueous solutions are often expressed with moles, the convenient unit for nucleation is not the nuclei number per unit area of substrate per unit time (e.g., nuclei μm−2 min−1), but moles of the reactant consumed per unit area of substrate per unit time (e.g., mol μm−2 min−1). Using the lateral nucleus radius value of 4.7 ± 0.7 nm obtained from GISAXS, the individual nucleus volume was calculated as 34.5 ± 15.4 nm3. This calculation employs the calculation method and nucleus geometry obtained in our previous study1,36, where an individual nucleus resembles the top section of an ellipsoid, with a contact angle on quartz <90° and a fixed ratio of nucleus height to lateral radius equal to 1/6. The relative standard deviation of nuclei volumes is expected to be reduced for materials that generate larger nuclei, because dimensions of large nuclei can be measured more accurately. Multiplication of individual nucleus volumes (m3) and nucleation rates (nuclei m−2 s−1) gives nucleation rates in units of the volume of nuclei per unit area of substrate surface per unit time (i.e., m3 m−2 s−1). If the CaCO3 phase is calcite, as commonly used in reactive transport modeling approaches, the moles of Ca2+ or CO32− ions consumed from the solution can be calculated by dividing the volume nucleation rates (m3 m−2 s−1) by the molar volume of calcite (m3 mol−1), where the molar volume is just the product of the reciprocal of calcite density (g cm−3) and the molar weight (g mol−1) of calcite:

$$\upsilon = \frac{1}{\rho } \times MW.$$
(4)

If the nucleating CaCO3 phase is assumed to be other than calcite, the Ksp and molar volume should correspond to that specific phase, but the methods for obtaining Ea and A are the same as presented for calcite. The results for vaterite are available in Supplementary Note 1 and Supplementary Table 1. The obtained nucleation rate was in moles of Ca2+ or CO32− ions consumed per unit area of substrate surface per unit time (i.e., mol m−2 s−1). Because the pre-exponential kinetic factor, A, has the same units as the nucleation rate J, the unit conversion method for A is the same as that for J. The calculated values for J and A with different units are shown in Table 1.
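The conversion chain described above (nuclei per area, to nucleus volume per area, to moles per area) can be sketched as follows. The calcite density and molar mass are standard handbook values assumed for this illustration, and the function names are hypothetical. Applied to the reported A, the result is close to the \(10^{2.9}\) mol m−2 min−1 quoted in the Introduction.

```python
import math


def molar_volume_m3(density_g_cm3, molar_mass_g_mol):
    """Eq. (4): molar volume = MW / density, converted from cm^3 to m^3 per mol."""
    return molar_mass_g_mol / density_g_cm3 * 1e-6


def nuclei_to_molar_rate(rate_nuclei_um2_min, nucleus_volume_nm3, v_molar_m3):
    """Convert nuclei um^-2 min^-1 into mol m^-2 min^-1."""
    rate_nuclei_m2 = rate_nuclei_um2_min * 1e12                # 1 m^2 = 1e12 um^2
    volume_rate = rate_nuclei_m2 * nucleus_volume_nm3 * 1e-27  # m^3 m^-2 min^-1
    return volume_rate / v_molar_m3


# Assumed handbook values for calcite (not given in the text).
v_calcite = molar_volume_m3(density_g_cm3=2.71, molar_mass_g_mol=100.09)

# Reported values: A = 9.4e11 nuclei um^-2 min^-1, nucleus volume = 34.5 nm^3.
A_mol = nuclei_to_molar_rate(9.4e11, 34.5, v_calcite)

print(f"calcite molar volume = {v_calcite * 1e6:.1f} cm^3/mol")
print(f"A = {A_mol:.0f} mol m^-2 min^-1 = 10^{math.log10(A_mol):.1f}")
```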

For heterogeneous nucleation, the geometry of the nuclei, including the nucleus contact angle on substrates and the ratio of nucleus height to lateral radius, is accounted for by the effective interfacial energy α. Because the α-value of 47 mJ m−2 used for calculation in this study was obtained from direct observation of CaCO3 on quartz using the same setup and under similar conditions36, it already incorporates the information on nucleus geometry, which does not need to be specifically determined. Therefore, there is no need to consider nucleus geometry a second time when calculating A (nuclei μm−2 min−1), Ea (kJ mol−1), and J (nuclei μm−2 min−1) using this α-value. However, when converting A and J to units of nm3 μm−2 min−1 or mol m−2 min−1, the geometry of the nuclei will influence the results by affecting the individual nucleus volume. For example, if the ratio of nucleus height to lateral radius doubles from 1/6 to 1/3, the volume of an individual nucleus will increase by 90% from 34.5 ± 15.4 to 65.5 ± 29.3 nm3, resulting in a 90% increase in the values of A and J with units of nm3 μm−2 min−1 and mol m−2 min−1, while A and J in other units in Table 1 remain unchanged.

Sensitivity analyses for thermodynamic and kinetic factors

The overall energy barrier for nucleation is a combination of the kinetic energy barrier (Ea) and the thermodynamic energy barrier (ΔG*). ΔG* is largely determined by supersaturations and interfacial energies, whereas Ea is related to kinetic factors such as the rate of monomer diffusion in solution and on the substrate, the impingement rate of monomers onto the substrate, and the incorporation rate of monomers into existing nuclei44. Because Ea and ΔG* are both Arrhenius-type energy barriers according to Eq. (1), it has been hard to estimate the nucleation rate as a function of temperature without a reliable Ea value. With Ea quantified, it is now possible to analyze the nucleation rate as a function of temperature. Taking the conditions in this study as an example (Fig. 6a, α = 47 mJ m−2, IAP/Ksp = \(10^{1.65}\), Ea = 45 kJ mol−1, and ΔG*(25 °C) = 16 kJ mol−1), the CaCO3 nucleation rate is shown to increase with a decrease in either α or Ea, or an increase in log10(IAP/Ksp), according to Eqs. (1) and (3), for the temperature range of 10–50 °C. The amount of change is set to 20% to allow clear comparison. In this case the nucleation rate is effectively enhanced by tuning any of the three parameters. However, for a system with an interfacial energy as low as 30 mJ m−2 (Fig. 6b, IAP/Ksp = \(10^{1.65}\), Ea = 45 kJ mol−1, and ΔG*(25 °C) = 4 kJ mol−1), further reducing the interfacial energy or enhancing supersaturation does not increase the nucleation rate as effectively as reducing the kinetic barrier (Ea). Conversely, if the interfacial energy is 60 mJ m−2 (Fig. 6c, IAP/Ksp = \(10^{1.65}\), Ea = 45 kJ mol−1, and ΔG*(25 °C) = 34 kJ mol−1), reducing the interfacial energy will be the most effective way (compared to reducing Ea or enhancing supersaturation) to enhance the nucleation rate. Such sensitivity analyses reveal the relative importance of kinetic and thermodynamic factors, and can potentially guide prediction and optimization of systems where nucleation takes place.

Fig. 6

Sensitivity analyses of nucleation rate for CaCO3 nucleation on quartz. The variation in nucleation rates is shown by plots of ln(J/A) as a function of temperature. Panel a shows the base case with α = 47 mJ m−2, as for CaCO3 nucleation on quartz, and panels b and c are for base cases with α = 30 and 60 mJ m−2, respectively. In all base cases, log10(IAP/Ksp) = 1.65, and Ea = 45 kJ mol−1. The increase in ln(J/A) is calculated with either a 20% reduction in α, a 20% increase in log10(IAP/Ksp), or a 20% reduction in Ea, and is compared with the base case. (a) When α = 47 mJ m−2, the nucleation rate at all temperatures is effectively enhanced by a 20% variation in α, log10(IAP/Ksp), or Ea. (b) If α = 30 mJ m−2, the nucleation rate is most effectively enhanced by a decrease in Ea. (c) If α = 60 mJ m−2, a reduction in α will significantly increase the nucleation rate.
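The sensitivity analysis can be sketched numerically from Eqs. (1) and (3), using ln(J/A) = −(Ea + ΔG*)/RT. As assumptions for this illustration, the calcite molar volume is computed from handbook density and molar mass, and Eq. (3) is multiplied by Avogadro's number to express ΔG* per mole. The sketch evaluates only 25 °C and reports which 20% change raises ln(J/A) the most for each base case.

```python
import math

R = 8.3144598                    # J mol^-1 K^-1
N_A = 6.02214076e23              # mol^-1 (assumption: per-mole conversion)
V_M = 100.09 / 2.71 * 1e-6       # calcite molar volume, m^3 mol^-1 (assumed)


def delta_G_star(alpha, T, log10_S):
    """Eq. (3), in J mol^-1, for alpha in J m^-2."""
    ln_S = log10_S * math.log(10)
    return N_A * 16 * math.pi * V_M**2 * alpha**3 / (3 * R**2 * T**2 * ln_S**2)


def ln_J_over_A(alpha, log10_S, Ea, T):
    """ln(J/A) = -(Ea + dG*) / (R T), from Eq. (1)."""
    return -(Ea + delta_G_star(alpha, T, log10_S)) / (R * T)


def gains(alpha, log10_S=1.65, Ea=45e3, T=298.15, change=0.20):
    """Increase in ln(J/A) from changing one parameter at a time by `change`."""
    base = ln_J_over_A(alpha, log10_S, Ea, T)
    return {
        "alpha": ln_J_over_A(alpha * (1 - change), log10_S, Ea, T) - base,
        "supersaturation": ln_J_over_A(alpha, log10_S * (1 + change), Ea, T) - base,
        "Ea": ln_J_over_A(alpha, log10_S, Ea * (1 - change), T) - base,
    }


for alpha in (0.030, 0.047, 0.060):
    g = gains(alpha)
    best = max(g, key=g.get)
    dG_kJ = delta_G_star(alpha, 298.15, 1.65) / 1e3
    print(f"alpha = {alpha * 1e3:.0f} mJ/m2: dG* = {dG_kJ:.1f} kJ/mol, "
          f"largest gain from {best}")
```

For α = 30 mJ m−2 the largest gain comes from lowering Ea, and for α = 60 mJ m−2 it comes from lowering α, matching panels b and c.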

Methods

Substrate preparation

Environmentally abundant quartz was used as the substrate for heterogeneous nucleation experiments. Atomically flat quartz substrates with a polished (100) plane (roughness <5 Å) were purchased from MTI Corporation. The substrates were cut into 1 cm × 1 cm squares and cleaned by sonication in acetone, ethanol, isopropanol, and ultrapure deionized water (DI water, 18.2 MΩ cm) successively for 20 min each. Then they were soaked in a mixture of sulfuric acid and Nochromix® for 2 h to remove any remaining organic compounds. The substrates were then rinsed thoroughly with DI water and stored in DI water for experimental use. The cleaned substrates were scanned using AFM to ensure the cleanliness of the surface.

Aqueous chemistry

To generate a solution with a constant supersaturation with respect to CaCO3, we used a flow-through system. A NaHCO3 solution and a CaCl2 solution were separately driven by two peristaltic pumps into a micro-mixer to create a supersaturated solution. Using Geochemist’s Workbench (GWB, Release 8.0, RockWare, Inc.)45 and the thermo_minteq database, we quantified the supersaturation of the mixed solution to be IAP/Ksp = \(10^{1.65}\) at 12, 25, and 31 °C. These conditions were chosen so that heterogeneous nucleation occurs within the experimentally detectable window of both GISAXS and AFM within a reasonable reaction time. More discussion about the choice of conditions for the GISAXS-AFM method is available in Supplementary Note 2. Homogeneous nuclei are much larger than heterogeneous nuclei and much less numerous (Fig. 4), and thus can be excluded during GISAXS and AFM data analysis. To quantify the aqueous supersaturation, the solubility product of calcite (Ksp = \(10^{-8.42}\) at 12 °C, Ksp = \(10^{-8.48}\) at 25 °C, and Ksp = \(10^{-8.52}\) at 31 °C) was used for calculation45. The mixed solution was also supersaturated with respect to vaterite (see Supplementary Note 1), but was undersaturated with respect to amorphous CaCO3 (refs. 36,46). The supersaturated solution was then injected into our reaction cell (2 mL volume) at a volumetric flow rate of 5.6 mL min−1. The top surface of a piece of quartz substrate on the bottom of the reaction cell was in contact with the freshly injected solution. Waste solution was exhausted from the top of the cell and collected for disposal. A diagram of the experimental setup is shown in Supplementary Figure 3.

To obtain the same supersaturation of the mixed solution at different temperatures, the concentration of CaCl2 was tuned slightly to maintain a supersaturation fixed at IAP/Ksp = \(10^{1.65}\). The ratio of Ca/HCO3 was kept at 3.6–4.4 for all conditions, and the pH was within a 7.8–8.0 range. To control the system temperature, before mixing, the solutions of CaCl2 and NaHCO3 were either heated in the tubing using a heating plate, or cooled with a water/ice mixture. To minimize heat loss through tubing walls, the tubes after the temperature control component were covered with insulating foam. The temperature of each supersaturated solution was measured before and after the reaction at the inlet of the reaction cell, and the temperature fluctuation was less than 1 °C from the target value.

In situ GISAXS

In situ GISAXS data were collected at beamline 12-ID-B at the Advanced Photon Source (Argonne National Laboratory, USA). The reaction cell was aligned in the beamline. The front and back walls of the cell were made of Kapton film, allowing transmission of X-rays. A 14 keV X-ray beam (200 μm in width and 20 μm in height) incident onto the substrate at an angle of 0.11° (which gives a reflectivity of 98.8% for quartz) was scattered by particles on the substrate. The scattered X-rays were collected by a Pilatus 2M detector 2 m away from the sample, downstream of the beam. The scattering intensity from the particles on the quartz surface was obtained from in-plane cutting along the Yoneda wing in the same way as described in our previous studies36. After background subtraction, the intensity (I) was plotted versus the scattering vector (q). The scattering vector is in the reciprocal space of the horizontal dimension of the particles, and was 0.008–0.4 Å−1 in our setup. This corresponds to an observable particle’s lateral radius range of 0.8–40 nm, which captures well the in-plane heterogeneous CaCO3 nucleus radius, usually several nanometers. Scattering data at q values larger than 0.1 Å−1 have low signal-to-noise ratios and are largely affected by background scattering, and therefore were not used for analyses. After obtaining the I(q) plots, we used the conventional SAXS formula for further analyses, because the nucleus radius in our system (about 5 nm) is small enough (R < 4.5λ/4παi) that the GISAXS-specific formula derived from the distorted-wave Born approximation is not necessary47. The total volume of heterogeneously formed nuclei with radii within 3–30 nm was represented by the invariant (Q), where \(Q = \frac{1}{{2\pi ^2}}\mathop {\int }\nolimits I(q)q^2{\mathrm{d}}q\) for \(q\) = 0.01–0.1 Å−1 (refs. 36,39). For a nucleation-dominant system where nucleus size does not change significantly, the invariant is proportional to the nucleus number, and can be treated as the total nucleus number with a relative unit comparable to all invariant values calculated in this study48. The invariant method can be applied to the samples with a signal-to-noise ratio too low to be fitted with the GISAXS fitting method.
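The invariant can be evaluated numerically from a measured I(q) curve. The sketch below uses a simple trapezoidal rule over q = 0.01–0.1 Å−1. The intensity curve is a synthetic Guinier-type placeholder, not measured data, and the chosen radius of gyration is an assumption for illustration. It also shows the proportionality used in the text: if the nucleus size is fixed, doubling the nucleus number doubles Q.

```python
import math


def invariant(q, intensity):
    """Trapezoidal estimate of Q = (1 / (2 pi^2)) * integral of I(q) q^2 dq."""
    total = 0.0
    for k in range(len(q) - 1):
        f0 = intensity[k] * q[k] ** 2
        f1 = intensity[k + 1] * q[k + 1] ** 2
        total += 0.5 * (f0 + f1) * (q[k + 1] - q[k])
    return total / (2 * math.pi ** 2)


# q grid in inverse angstroms, restricted to the range used for analysis.
n_points = 200
q = [0.01 + (0.1 - 0.01) * k / (n_points - 1) for k in range(n_points)]


def synthetic_intensity(number, rg_angstrom=36.0):
    """Placeholder Guinier-type curve, I = N * exp(-q^2 Rg^2 / 3) (hypothetical)."""
    return [number * math.exp(-(qi * rg_angstrom) ** 2 / 3.0) for qi in q]


Q_one = invariant(q, synthetic_intensity(1.0))
Q_two = invariant(q, synthetic_intensity(2.0))

print(f"Q(N=1) = {Q_one:.3e} r.u., Q(N=2)/Q(N=1) = {Q_two / Q_one:.2f}")
```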

For samples with relatively high signal-to-noise ratios, we fitted the GISAXS intensity according to36,39,48,49

$$I\left( q \right) = N \cdot {\mathrm{\Delta }}\rho ^2 \cdot \mathop {\int }\nolimits D\left( R \right)V^2\left( R \right)P\left( {q,R} \right){\mathrm{d}}R \cdot \left[ {I_{{\mathrm{pow}}}q^{ - p} + S\left( q \right)} \right],$$
(5)

where N is the nucleus number, Δρ is the difference in scattering length density between the nuclei and the solution background, D(R) is the nucleus size distribution, V(R) is the volume of nuclei with an in-plane radius of R, and P(q,R) is the form factor, which in this case is written for spherical particles. The \(I_{{\mathrm{pow}}}q^{ - p}\) term expresses the intensity contribution from large surfaces such as those of aggregates and large particles settled from solution. The structure factor S(q) is equal to one for our dilute system. Because of the uncertainty in the properties of newly formed CaCO3 nuclei in terms of water content and polymorphism, the absolute value of Δρ is hard to estimate, and is assumed constant and assigned a value of 1, yielding fitted values of N that are proportional to the actual particle numbers. Thus, the fitted N can be treated as particle numbers with a relative unit comparable to any fitted N values throughout this study36. The nucleus numbers obtained from the invariant method and fitting method are in different units, but are consistent with each other after linear scaling36, as shown in Fig. 2. The rates of increase in the nucleus number (obtained from the fitting method) were recorded as nucleation rates, with relative units. The variation of these nucleation rates obtained at different temperatures was used to calculate the apparent activation energy, Ea.
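To make the role of N in Eq. (5) concrete, the sketch below evaluates a strongly simplified version of the model: a single nucleus size (no size distribution D(R)), Δρ = 1, S(q) = 1, and no power-law term. These simplifications are assumptions for illustration only and do not reproduce the fitting procedure used in this study. P(q,R) is the standard form factor of a sphere.

```python
import math


def sphere_form_factor(q, radius):
    """P(q, R) for a sphere: [3 (sin x - x cos x) / x^3]^2 with x = q R."""
    x = q * radius
    if x < 1e-4:
        return 1.0  # small-angle limit, avoids 0/0
    amplitude = 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3
    return amplitude ** 2


def monodisperse_intensity(q, number, radius, delta_rho=1.0):
    """Simplified Eq. (5): I(q) = N * drho^2 * V(R)^2 * P(q, R)."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return number * delta_rho ** 2 * volume ** 2 * sphere_form_factor(q, radius)


radius = 47.0  # angstroms, i.e. the 4.7 nm nucleus radius reported above
q_values = [0.01, 0.02, 0.05, 0.1]  # inverse angstroms

for q in q_values:
    i_one = monodisperse_intensity(q, number=1.0, radius=radius)
    i_three = monodisperse_intensity(q, number=3.0, radius=radius)
    print(f"q = {q:.2f} A^-1: I(N=3)/I(N=1) = {i_three / i_one:.1f}")
```

Because N enters only as a prefactor, the fitted N scales linearly with the number of scatterers as long as size and contrast stay fixed, which is why it can serve as a relative nucleus number.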

Assuming a Schultz distribution D(R) for a polydisperse system, the radius of gyration can be calculated as Rg. In our systems, the nucleus size did not vary significantly, which further justifies the idea that the invariant values that are proportional to the total nucleus volume are also proportional to the nucleus numbers.

Ex situ atomic force microscopy

To calibrate the nucleation rates in relative units obtained with GISAXS against absolute nucleation rates, and to visually observe nanometer-scale nuclei, we complemented the GISAXS experiment with ex situ AFM experiments. The experiments were conducted as described in the GISAXS section, except that the reaction was ended at different time points. At each desired reaction time, the substrate was taken out of the cell. The residual reaction solution was immediately rinsed off with ethanol to end the reaction. Ethanol (100%) was used instead of water to minimize dissolution of CaCO3 nuclei. The substrate was then dried with ultrapure nitrogen gas, and immediately scanned using AFM to minimize aging of the nuclei. Only the evenly distributed particles on the surface were counted to obtain the nuclei density (nuclei μm−2), because even distribution occurs mainly in systems where epitaxial heterogeneous nucleation takes place on the substrate. The larger, randomly distributed particles were formed in solution and settled on the surface (i.e., homogeneously nucleated particles). The manually counted nucleus densities per unit area were used to calibrate the nucleation rate (r.u.) obtained with GISAXS. Tapping mode was used to collect images. AFM probes used in this study were purchased from Bruker (Model: RTESP, Part: MPP-11100-10). Because the AFM tip radius (8 nm) is comparable to the lateral nuclei radius, and because CaCO3 nuclei might have experienced dehydration after the substrate was taken out of the reaction solution, we did not use AFM images to extract lateral nucleus sizes. In this study, AFM imaging provided only nuclei number densities on the substrate.