An understanding of the coupling between clouds and atmospheric circulation—one of the World Climate Research Programme’s seven Grand Challenges—is a crucial missing link for constraining estimates of cloud feedback, that is, the response of clouds to a warming climate1,2. Cloud feedback estimates, especially those associated with low clouds, constitute one of the largest uncertainties in current assessments of climate sensitivity3,4. The link between circulation and moisture variance at mesoscales (\({{{\mathcal{O}}}}\)(100 km, 1 h)) influences the amount of trade cumulus clouds5,6 as well as their spatial organization7. Both aspects are crucial for low-cloud feedback6,8,9. Idealized large domain large-eddy simulations (LES) show that the spatial organization of clouds is coupled to shallow overturning circulations, which create moist and dry anomalies in their ascending and descending branches, respectively10,11,12. Such mesoscale circulations cannot be explicitly resolved by the coarse-resolution global climate models, nor are they represented in these models’ cloud parameterizations6, which have been designed without contributions from these scales of motion in mind. This increases interest in determining if such circulations are evident in nature and, if so, just how prevalent they are.

Recently, the field campaign EUREC4A (Elucidating the Role of Cloud–Circulation Coupling in Climate13,14) made extensive measurements of mesoscale horizontal divergence (\({{{\mathcal{D}}}}\)), making it possible to explore the presence of such circulations and thus test inferences from modelling. The \({{{\mathcal{D}}}}\) measurements are samples averaged over a ~220-km-diameter circle for ~1 h in the North Atlantic trades14,15,16, henceforth referred to as circles (Fig. 1a,d). We analyse 65 circles from 11 flights spread over 4 weeks in January–February 2020. As shown in Fig. 1a, a flight day typically included two circling sets (three consecutive circles each) separated by an hour. In this Article, using EUREC4A measurements we: (1) present observational evidence for shallow mesoscale overturning circulations (SMOCs hereafter), (2) characterize their spatial scales and frequency of occurrence with help from meteorological re-analysis, and (3) propose a mechanism by which SMOCs amplify moisture variance.

Fig. 1: Divergence and humidity measurements from EUREC4A.

a, Top view of the HALO aircraft flying a circle with markers representing launch location of dropsondes. b,e, Vertical profiles of divergence \({{{\mathcal{D}}}}\) (b) and specific humidity q (e) averaged over EUREC4A circles. c,f, Anomalies of \({{{\mathcal{D}}}}\) (c) and q (f) from time mean (\({{{{\mathcal{D}}}}}^{{\prime} }\) and \({q}^{{\prime} }\)) are shown as hues. Descriptions of terms explaining the sampling strategy (circle, circling set and flight day) are for typical samples. Deviations in some cases are detailed in refs. 5,15. d, A side-view depiction of multiple dropsondes (i, j, k) in flight.

Observations of shallow mesoscale circulations

The campaign mean \({{{\mathcal{D}}}}\) (Fig. 1b) is consistent with the theoretical understanding of the trades being on average a region of weak subsidence (ω) (ref. 17). With negligible horizontal temperature advection under the weak temperature gradient approximation18, adiabatic warming due to this weak subsidence must on average balance the radiative cooling in the trades. In the free troposphere, the EUREC4A mean ω (~24 hPa per day) balances a mean cooling of ~1.3 K per day, which is consistent with observed climatological cooling rates in the trades19,20. Below the free troposphere, in the trade-wind layer (from surface up to ~2.3 km), \({{{\mathcal{D}}}}\) increases from the surface upwards and is then roughly constant through the bulk of the layer. This vertical coherence, however, is restricted to the campaign mean and thus representative only of the larger synoptic scale.

At shorter timescales, ranging from the circle scale to the flight-day scale, \({{{\mathcal{D}}}}\) departs markedly from the campaign mean (Fig. 1c), indicating large vertical velocities unbalanced by radiation. The divergence anomaly (\({{{{\mathcal{D}}}}}^{{\prime} }\)) also changes sign between the subcloud and cloud layers. Averaged over circling sets and flight days (~3 and ~6–7 h, respectively), we find an anti-correlation between \({{{{\mathcal{D}}}}}^{{\prime} }\) averaged over the subcloud (\({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\)) and cloud layer (\({{{{\mathcal{D}}}}}_{{{{\rm{c}}}}}^{{\prime} }\)) (Fig. 2a). The anomalies from the mean, with few exceptions, are large enough to stand as cases of absolute convergence and divergence. Thus, when there is convergence in the subcloud layer, air diverges in the cloud layer and vice versa. The prevalence of this \({{{{\mathcal{D}}}}}^{{\prime} }\) dipole in the lower atmosphere indicates the presence of shallow overturning circulations, with circles sampling either ascending or descending branches. Given EUREC4A’s unbiased sampling and the sign changes in \({{{\mathcal{D}}}}\) over consecutive flights, we believe that the dipole is a mesoscale feature that is almost always apparent.

Fig. 2: Relationships with subcloud layer divergence.

a–c, Scatter plots against \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) of \({{{{\mathcal{D}}}}}_{{{{\rm{c}}}}}^{{\prime} }\) (a), \({q}_{{{{\rm{sc}}}}}^{{\prime} }\) (b) and \({q}_{{{{\rm{cb}}}}}^{{\prime} }\) (c). Subscripts ‘sc’, ‘cb’ and ‘c’ stand for averaging over subcloud (0–600 m), cloud-base (600–900 m) and cloud (900–1,500 m) layers, respectively. Cross hairs show the standard deviation (sample size n = 6) in the mean along altitude. r values indicate Pearson’s correlation coefficient for flight-day means (solid pink) and circling-set means (hollow grey).

We investigate the vertical structure of these circulations, by analysing composites of the lowest and highest quartiles of \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) (Fig. 3a–d). To distinguish the circulation features, analyses in Fig. 3 exclude data from 24 January 2020, the only day with flight-day mean missing the \({{{{\mathcal{D}}}}}^{{\prime} }\) dipole (data point in lower-left quadrant in Fig. 2a). Figure 3a,b suggests that the circulations are shallow, being largely confined to the trade-wind layer (lower ~2.3 km). The shallowness is made further evident by the fact that the strongest anti-correlation of \({{{\mathcal{D}}}}\) with \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) happens within and throughout the cloud layer (Fig. 3e). This shallowness is not unexpected given the large values of \({{{{\mathcal{D}}}}}^{{\prime} }\) (Fig. 3a), which if maintained over a deeper layer would imply much larger \({\omega }^{{\prime} }\). Even for circulations as shallow as those observed, \({\omega }^{{\prime} }\) goes up to 3 hPa h−1 (Fig. 3b), which, if sustained over a period of a day, would imply displacements of ~670 m per day. If not compensated by adjacent branches of similar magnitude, such large displacements would lead to large pressure gradients and a deep saturated layer in the ascending branch, both of which are inconsistent with the shallow convective nature of the wintertime trades.
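The conversion from a pressure velocity of 3 hPa h−1 to a displacement of roughly 670 m per day can be checked with a short hydrostatic estimate. The sketch below is illustrative only: the air density of 1.1 kg m−3 is an assumed representative value for the trade-wind layer, and the function name is hypothetical rather than taken from the study.

```python
# Hydrostatic estimate of vertical displacement from a pressure velocity,
# using |w| ≈ |omega| / (rho * g). Only the magnitude is computed here.

G = 9.81        # gravitational acceleration, m s-2
RHO_AIR = 1.1   # ASSUMED representative air density in the trade-wind layer, kg m-3


def omega_to_displacement_per_day(omega_hpa_per_h, rho=RHO_AIR, g=G):
    """Return the magnitude of the vertical displacement in metres per day."""
    omega_pa_per_h = omega_hpa_per_h * 100.0   # hPa -> Pa
    w_m_per_h = omega_pa_per_h / (rho * g)     # vertical speed, metres per hour
    return w_m_per_h * 24.0                    # sustained over one day


displacement = omega_to_displacement_per_day(3.0)
print(f"{displacement:.0f} m per day")
```

With the assumed density, the estimate lands within a few metres of the ~670 m per day quoted above; a different density choice shifts the result proportionally.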

Fig. 3: Quartile composites and correlations with subcloud divergence.

a–d, Averaged profiles of anomalies of \({{{\mathcal{D}}}}\) (a), subsidence ω (b), q (c) and net longwave radiative heating rate \({Q}_{{{{\rm{LW}}}}}^{{\prime} }\) (d) are shown for the lowest (Q1, strongest convergence) and highest (Q4, strongest divergence) quartiles of \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\). Dotted lines in a–d show the interquartile range (IQR) for Q1 and Q4. e,f, Vertical profiles of Pearson’s correlation coefficients (r value) are shown between \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) and \({{{\mathcal{D}}}}\) (e) and \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) and q (f). Dashed lines show correlation from flight-day averages (FDavg), whereas the coloured profiles show correlation from circle scale, but \({{{\mathcal{D}}}}\) lagging \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) in time as indicated in the legend. Profiles exclude circles from flight on 24 January 2020. Sample sizes are provided in parentheses in the legends for means and IQR (a–d) and for correlations (e and f).

Ubiquity and spatial scale of SMOCs

To further test the idea that the circulations are mesoscale, we look into the European Centre for Medium-Range Weather Forecasts re-analysis product (ERA5)21 over a 10° × 10° domain, with instantaneous values of horizontal divergence of wind velocity available at 0.25° spatial and 1 h temporal intervals. Re-analyses are thought to be reliable only for their synoptic reconstruction of divergence (for example, refs. 22,23). However, ERA5 turns out to reproduce mesoscale \({{{\mathcal{D}}}}\) from the EUREC4A measurements in the lowest ~2.5 km (Extended Data Fig. 1), and it does so independent of the assimilation of EUREC4A soundings (Methods). This ability of ERA5 to reproduce mesoscale \({{{\mathcal{D}}}}\) is probably due to the assimilation of scatterometer winds at the ocean surface and therefore presumably not limited to the EUREC4A region and period.

ERA5’s ability to capture \({{{\mathcal{D}}}}\) allows us to investigate SMOCs’ occurrence and spatial coverage. Similar to the measurements, we identify SMOCs in ERA5 by selecting grid points with a \({{{{\mathcal{D}}}}}^{{\prime} }\) dipole. We then cluster such grid points into SMOC objects and quantify the shape, size and orientation of these objects by fitting them to equivalent ellipses (Methods and Fig. 4a,b). Strikingly, SMOCs cover a large fraction of the domain in Fig. 4a,b. We see a similar spatial prevalence of SMOCs for the entire EUREC4A period: 58 ± 7% of the domain is covered by SMOCs (also see Extended Data Fig. 2). The prevalence of the \({{{{\mathcal{D}}}}}^{{\prime} }\) dipole in circles, combined with the spatio-temporal omnipresence of SMOCs in ERA5, shows that SMOCs are ubiquitous in the North Atlantic trades. Applying the same analysis to the north-east Pacific trades shows similar statistics (Extended Data Table 1)17, leading us to infer the ubiquity of such features across different trade-wind regions.

Fig. 4: Scale and orientation of SMOC objects in re-analyses.

a,b, A typical snapshot of ERA5 \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) for a 10° × 10° domain (14 February 2020 09:00 UTC). Overlaid streamlines (a) show horizontal wind in the subcloud layer; thicker lines indicate stronger winds. The circle (teal) indicates the EUREC4A circle. Similar \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) maps at 12 h snapshots for January–February 2020 are shown in Extended Data Fig. 2. Shading (b) indicates convergent (blue) and divergent (red) clusters, with the centroid (grey circle), major axis (pink dashed) and minor axis (green dashed) shown for the SMOC objects (for details, see Methods). c, Gaussian-kernel probability density function (PDF; bin width ~2 km) of major axis length (pink), minor axis length (green) and effective diameter (deff; black) for all SMOC objects (sample size n = 21,075) detected in the same domain every hour during the EUREC4A period. Box plots above show median (line in box), first and third quartiles (ends of box) and 5th and 95th percentiles (ends of whiskers). Lengths (in km) are derived with the approximation that 1° ≈ 100 km. d, PDF (bin width π/150) of orientation of SMOC objects weighted by their area, with 0 indicating parallel and π/2 indicating tangential alignment of the major axis.

Figure 4c shows the distribution of the major and minor axes lengths and effective diameters (deff) of SMOC objects for the EUREC4A period. The median values of all three lengths lie between 80 km and 200 km, quantifying the size of these circulations’ branches. This spatial scale derived from ERA5 fits well with the scale estimated from the measurements. The correlation of \({{{\mathcal{D}}}}\) with \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) (Fig. 3e) shows that SMOCs persist for longer than 1 h, as the peak anti-correlation between \({{{{\mathcal{D}}}}}_{{{{\rm{c}}}}}\) and \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) occurs 2–3 h apart, with \({{{{\mathcal{D}}}}}_{{{{\rm{c}}}}}\) lagging \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\). Considering 9 m s−1 winds, air masses would traverse the circle in ~7 h (Fig. 5) and flight-day measurements spanned ~8 h. Hence, if SMOCs are of similar spatial scales as in Fig. 4c, one flight would sample only one branch of the circulation, which is consistent with what we observe, as \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) rarely changes sign through the course of a flight day (Fig. 1c). These spatial scales, along with the adjacency of convergent and divergent cells, confirm that the dipole signals in measurements are indeed from circulations at the mesoscale.
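The advective timescale used in this argument follows from dividing the circle diameter by the wind speed. A minimal sketch of that arithmetic, using the ~220 km diameter and 9 m s−1 wind quoted above (the function names are hypothetical):

```python
# Advective timescale arithmetic: time for an air mass to cross the circle,
# and distance advected over a flight day, for a uniform wind.

def crossing_time_hours(distance_km, wind_speed_m_per_s):
    """Hours needed to travel distance_km at a constant wind speed."""
    return distance_km * 1000.0 / wind_speed_m_per_s / 3600.0


def advected_distance_km(duration_h, wind_speed_m_per_s):
    """Distance covered in duration_h hours at a constant wind speed."""
    return wind_speed_m_per_s * duration_h * 3600.0 / 1000.0


t_circle = crossing_time_hours(220.0, 9.0)   # circle diameter, trade-wind speed
d_flight = advected_distance_km(8.0, 9.0)    # ~8 h flight day

print(f"crossing time: {t_circle:.1f} h, advected distance: {d_flight:.0f} km")
```

Under the steady-wind assumption, the distance advected over a flight day is of the same order as the upper end of the SMOC length scales in Fig. 4c.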

Fig. 5: Schematic of our SMOC hypothesis.

E stands for entrainment rate and \({M}^{{\prime} }\) for shallow convective mass flux anomaly. The blue and brown hues represent moisture anomalies. The streamline shows the sense of the envisioned circulation. The aspect ratio of the advected SMOC at the top is shown to scale, underscoring the shallowness of the circulations. For depiction, it is assumed that conditions remain steady during the advection.

Most SMOC objects are elongated rather than circular, as indicated by the offset between the major and minor axes length distributions in Fig. 4c. Figure 4d shows that the elongation tends to align in the zonal direction, but there is little indication that SMOCs are concentrated along the direction of the near-surface (or cloud base) zonal wind.

Moisture variance and maintenance of SMOCs

SMOCs co-vary with the mesoscale moisture fields. Figure 2b,c shows that subcloud convergence is associated with moister subcloud and cloud-base layers. The converse is true for subcloud divergence. For flight-day averages, the strongest anti-correlation in the vertical occurs at 670 m (r = −0.67). To test whether SMOCs contribute to or are caused by such mesoscale variability, we investigate time-lag correlations between \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) and specific humidity (q). The strongest anti-correlation occurs in the cloud-base layer at 0 h (Fig. 3f), whereas the strongest response of qsc occurs 2–3 h later. The strengthening of the anti-correlation between \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) and qsc with time suggests the direction of causality, that is, SMOCs amplify subcloud moisture variance.
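The idea behind a time-lag correlation can be illustrated with a small sketch. It assumes, for simplicity, regularly spaced hourly samples, which is not how the circles were actually distributed in time; the series below are synthetic and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_r(driver, response, lag):
    """Correlate driver[t] with response[t + lag]; lag >= 0, in samples."""
    if lag == 0:
        return pearson_r(driver, response)
    return pearson_r(driver[:-lag], response[lag:])


# SYNTHETIC hourly series: subcloud humidity responds to subcloud divergence
# two hours later with opposite sign (values are made up for illustration).
d_sc = [1.0, -2.0, 0.5, 3.0, -1.5, -0.5, 2.0, -3.0, 1.5, 0.0]
q_sc = [0.0, 0.0] + [-0.1 * d for d in d_sc[:-2]]

r_by_lag = {lag: lagged_r(d_sc, q_sc, lag) for lag in range(4)}
best_lag = min(r_by_lag, key=r_by_lag.get)  # lag with the strongest anti-correlation
print(best_lag, round(r_by_lag[best_lag], 3))
```

In this toy case the anti-correlation peaks at the imposed 2 h lag; with real, irregularly timed circles, pairs would instead have to be matched by their time separation.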

Here we develop a hypothesis of how SMOCs amplify the bottom-heavy moisture fluctuations (see bottom schematic in Fig. 5). In the rising branches, subcloud convergence increases the shallow-convective mass flux into the cloud-base layer6,24, which moistens cloud base. The moistened cloud-base reduces the drying efficiency of entrainment, a term representing small-scale mixing of dry air at cloud-base into the subcloud layer. Albright et al.25 show that, while entrainment is the dominant term balancing surface fluxes in the subcloud mass budget, the modulation of entrainment drying primarily results from moisture variability above the subcloud layer. Hence, with a moister cloud-base layer, the drying of the subcloud layer by entrainment becomes less efficient, thereby allowing surface moisture fluxes to accumulate moisture in the layer. The argument applies conversely for the descending branch. This process would lead to an accumulation of moisture in the subcloud layer of the ascending branch, and a corresponding moisture deficit in the descending branch. This bottom heaviness is consistent with observations (Fig. 3c,f). Our hypothesis for the bottom-heavy moisture variance comes with two inferences. Firstly, the mechanism is self-limiting. With time, the response of surface fluxes to moisture accumulation in the ascending branch and moisture deficit in the descending branch will oppose the development of a horizontal moisture gradient. This negative feedback potentially sets a limit to how large the mesoscale moisture variance can be. Secondly, the subcloud moisture responds to changes in entrainment drying efficiency via prolonged accumulation or deficit of moisture. This means that SMOCs’ capacity to influence subcloud moisture has a time dependence, that is, the bottom-heaviness is not an instantaneous response to SMOCs. This is consistent with the anti-correlations in Fig. 2b being stronger over flight-day means (~7–8 h) than over circling-set means (~3 h).

A maintenance of moist and dry branches in circulations will result in horizontal gradients of buoyancy and radiative cooling. Let us assume the lower and upper quartiles in \({q}^{{\prime} }\) and net longwave heating rate, \({Q}_{{{{\rm{LW}}}}}^{{\prime} }\) (Fig. 3c,d), represent the spatial differences between ascending and descending branches. The ascending branch (Q1) shows larger radiative cooling in the subcloud layer, which is opposite to what is expected from a circulation driven by radiative cooling differences26. Differences in shortwave heating between the composites are negligible (not shown). SMOCs are thus not driven by differential radiative cooling, at least during EUREC4A. One potential driver for circulations though is the buoyancy gradient arising from the moisture difference27. Although the time-lag analysis suggests that buoyancy gradients do not trigger circulations, they probably amplify or maintain SMOCs. While studies suggest differences in both radiative cooling26,28,29 and moisture-induced buoyancy27,29,30 as possible causes for shallow circulations, at the scales observed in our data, it seems that the former inhibits SMOCs and the latter maintains or amplifies them.

A natural question then is how SMOCs arise. Janssens et al.12, based on minimal-physics LES, argue that they are triggered by shallow convection’s intrinsic property to create unstable scale growth in mesoscale moisture fields. Our finding that SMOCs are also ubiquitous in nature lends strength to their argument that SMOCs are indeed a signature of an intrinsic instability of the tropical atmosphere. However, in contrast to the bottom-heavy moisture variance associated with SMOCs in EUREC4A data, LES show largest moisture variance near cloud-top and negligible variance in the subcloud layer10,11,12. In LES, the circulation–moisture interplay forms a positive feedback10,12. Spatial differences in condensation lead to mesoscale latent-heating anomalies in the cloud layer that, under the weak temperature gradient approximation, create balancing mesoscale vertical motions. The resulting circulation amplifies itself by reinforcing existing condensation anomalies. Although this mechanism explains the top-heavy variance, it is unclear whether such arguments would also be consistent with the bottom-heavy moisture variance associated with SMOCs in EUREC4A data. While SMOCs may be triggered by condensation-driven heating anomalies, their strength and associated moisture variance may be modulated by factors such as precipitation10,31,32, radiative cooling differences26,33 and sea-surface temperature gradients34.

Implications for model representation of mesoscale processes

EUREC4A measurements provide observational evidence for the prevalence of SMOCs in the North Atlantic trades and their influence on mesoscale moisture variance. Specifically:

  • Measurements show an anti-correlation between divergence in the subcloud and cloud layers. We interpret this dipole as being indicative of shallow overturning circulations.

  • The EUREC4A measurements allow us to assess that the low-level divergence in ERA5 is representative of the measurements, even when the measurements are not assimilated.

  • With ERA5, we show that SMOCs are usually elongated features of ~100–200 km and are ubiquitous (covering on average 58% of a 10° × 10° domain), thus explaining the large variability observed in mesoscale vertical velocity.

  • Subcloud convergence is correlated with moister subcloud and cloud-base layers, indicating a bottom-heavy moisture variance. By affecting the efficiency of entrainment drying, SMOCs probably amplify moisture variance by extending the moisture fluctuations at cloud base down to the subcloud layer.

  • Convergent subcloud layers are 0.7 g kg−1 moister and radiate energy at rates that lead to 0.3 K per day larger longwave cooling rates than divergent subcloud layers, indicating that SMOCs are unlikely to be driven by radiative anomalies. We believe that a moisture perturbation-induced buoyancy gradient in the subcloud layer could potentially maintain or amplify the SMOCs.

The ubiquity of SMOCs in EUREC4A observations and their coupling to mesoscale moisture fluctuations (and cloudiness5,6) indicate the mesoscale’s control on how clouds respond to climate change. The scale of the dominant energy in SMOCs is comparable to the grid scale of current climate models (~100 km) (ref. 35) and, if represented in these models, will probably be aliased to much larger scales. That some models do not represent the moisture-induced buoyancy effect36 suggests that they might lack an important control of circulations and moisture variability on the mesoscale. Therefore, exploring the instabilities and competing factors that drive SMOCs and the associated moisture fluctuations will improve our understanding of processes controlling cloud amount and organization. In this regard, differences between models and measurements (such as those in moisture variance) merit further investigation, something aided by our demonstration of the re-analyses’ ability to represent such circulations. Such investigations are further motivated by Vogel et al.6, who show with EUREC4A observations that the variability in mesoscale vertical velocities, which we attribute to SMOCs, substantially controls variability of shallow cumulus cloud amount.


EUREC4A dropsonde measurements

The field campaign EUREC4A took place in January–February 2020 over the tropical North Atlantic upwind of Barbados (see campaign overview in ref. 14). A core observation of EUREC4A was area-averaged horizontal mass divergence and vertical velocity profiles derived from dropsonde measurements along the circumference of a circular flight path37. In EUREC4A, the circular flight path was fixed to facilitate statistical sampling, with the centre at 57.67° W, 13.31° N and a diameter of 222.82 km (hereafter called EUREC4A circles), and flown by the German High Altitude and Long range (HALO) aircraft. To keep the sampling consistent, here we exclude HALO’s first (19 January 2020) and final (15 February 2020) research flights of the campaign and use data from 65 circles flown over the remaining 11 research flights, with a typical flight including 6 circles. Each circle typically launched 12 dropsondes spaced equally along the circumference over a period of an hour. On most flight days, HALO flew two sets of three circles each, called circling sets, with an excursion in between aimed at sampling upwind conditions. The two circling sets of a flight were carried out over a period of 7–8 h, here termed as a flight day. An overview of the circles flown during EUREC4A and the dropsondes therein is provided in ref. 16.

The dataset ‘Joint dropsonde Observations of the Atmosphere in tropical North atlaNtic mesoscale Environments’, with the backronym JOANNE16, provides measurements from the EUREC4A dropsondes. We use level-4 data of JOANNE, which provides the area-averaged quantities at 10 m vertical spacing from the circle measurements, such as horizontal mass divergence (\({{{\mathcal{D}}}}\)) and specific humidity (q). The measured quantities are from the surface up to 9.5 km, which was the typical flight altitude during the circles. From the dataset provided by ref. 38, we use the net radiative cooling rates, with circle values obtained by averaging over sondes in the circle.

Throughout the study, we use the terms subcloud layer, cloud-base layer and cloud layer (referred to as ‘sc’, ‘cb’ and ‘c’ subscripts, respectively) to indicate altitude intervals of 0–600 m, 600–900 m and 900–1,500 m from the surface, respectively (also indicated in Fig. 1c). We define the cloud-base layer as an extended transition layer between the subcloud and cloud layers to account for thermodynamic variability that is most tightly coupled to that within the subcloud layer39. We explored but found little benefit in trying to adapt these altitude intervals based on the specific structure of the trade-wind layer for any given day (also see ref. 40). The prime symbol is used to indicate the anomaly from campaign mean. For example, \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) is the divergence anomaly from time mean, averaged over the subcloud layer.
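The layer averages and anomalies defined above can be sketched as follows. The profile values are synthetic, the half-open treatment of the layer boundaries is an assumption (the text does not state which layer a boundary level belongs to), and the helper names are hypothetical.

```python
# Layer-averaged divergence anomalies from a profile on a 10 m vertical grid.

LAYERS_M = {"sc": (0, 600), "cb": (600, 900), "c": (900, 1500)}


def layer_mean(values, heights_m, z_bottom, z_top):
    """Mean of values at heights with z_bottom <= z < z_top (ASSUMED convention)."""
    selected = [v for v, z in zip(values, heights_m) if z_bottom <= z < z_top]
    return sum(selected) / len(selected)


heights = list(range(0, 1500, 10))  # 10 m spacing, as in the level-4 data

# SYNTHETIC circle profile: convergence below 600 m, divergence above.
profile = [-2.0 if z < 600 else 1.0 for z in heights]
# SYNTHETIC campaign-mean profile: weak, vertically uniform divergence.
campaign_mean = [0.5 for _ in heights]

# The prime denotes the anomaly from the campaign (time) mean.
anomaly = [p - m for p, m in zip(profile, campaign_mean)]

d_sc_prime = layer_mean(anomaly, heights, *LAYERS_M["sc"])
d_c_prime = layer_mean(anomaly, heights, *LAYERS_M["c"])

# A dipole is present when the two layer anomalies have opposite signs.
has_dipole = d_sc_prime * d_c_prime < 0
print(d_sc_prime, d_c_prime, has_dipole)
```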

ERA5 divergence and comparison with EUREC4A

We use \({{{\mathcal{D}}}}\) from ERA5 re-analysis products for the time period between 20 January 2020 00:00 UTC and 21 February 2020 00:00 UTC (parameter ID 155) available at 0.25° and 1 h intervals. First, we check the reliability of ERA5 divergence by comparing it with the circle observations. To make a comparison collocated in space–time, we average ERA5 divergence spatially over gridboxes included within the standard-circle area for the hourly timestep nearest to the mean time of each circle from observations. Extended Data Fig. 1 shows the agreement between these divergence profiles from ERA5 and the corresponding ones from JOANNE averaged for every flight day. Whereas the profiles shown are averages over the flight day, the r values in the figure are estimated from all individual profiles in that day. Thus, the re-analysis’ agreement of divergence with observations is also at the circle timescale (1 h) and not just when averaged over the flight day (6–7 h). The vertical structure of divergence simulated by ERA5 is the same as that seen in the circle observations for most days, thus lending confidence in the use of re-analysis fields to study the spatial and temporal variability in divergence.
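One way to realize the spatial collocation described above is to average all grid points whose centres fall inside the EUREC4A circle. The sketch below makes several assumptions that are not stated in the text: gridboxes are selected by their centre point, distances are great-circle distances on a sphere of radius 6,371 km, and the mean is unweighted. The field is synthetic and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0              # ASSUMED spherical Earth radius
CIRCLE_LAT, CIRCLE_LON = 13.31, -57.67  # circle centre (13.31° N, 57.67° W)
CIRCLE_RADIUS_KM = 222.82 / 2.0       # half of the quoted diameter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def circle_mean(field, lats, lons):
    """Unweighted mean of field[i][j] over grid points inside the circle.

    Returns (mean, number_of_points_used).
    """
    inside = [
        field[i][j]
        for i, lat in enumerate(lats)
        for j, lon in enumerate(lons)
        if haversine_km(CIRCLE_LAT, CIRCLE_LON, lat, lon) <= CIRCLE_RADIUS_KM
    ]
    return sum(inside) / len(inside), len(inside)


# SYNTHETIC 0.25° grid around the circle with a uniform field value.
lats = [11.0 + 0.25 * i for i in range(21)]
lons = [-60.0 + 0.25 * j for j in range(21)]
field = [[2.0 for _ in lons] for _ in lats]

mean_value, n_points = circle_mean(field, lats, lons)
print(mean_value, n_points)
```

At 0.25° resolution, only a few dozen grid points fall inside the circle, so the choice of selection rule at the circle edge can matter for individual profiles.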

The ERA5 products have assimilated information from the EUREC4A dropsondes and radiosondes. To check the influence of assimilation, we check the difference in divergence simulated by data-denial experiments. These experiments are the same as those described by ref. 41, where a control simulation (‘ctrl’) similar to the ERA5 operational product is run along with two data-denial experiments—one with no EUREC4A dropsondes (‘nd’) and the other with no EUREC4A dropsondes and radiosondes (‘ndr’) assimilated. We compare profiles between JOANNE and the experiments when the timestamps are within an hour of each other. The experiments have outputs available at 6 h intervals, and therefore, we have only 15 instances when \({{{\mathcal{D}}}}\) can be compared with JOANNE. Extended Data Fig. 3 shows the square root of the mean squared error between \({{{\mathcal{D}}}}\) in the three experiments and \({{{\mathcal{D}}}}\) in JOANNE (RMSE\({}_{{{{\mathcal{D}}}}}\)). The assimilation results in very little improvement in the simulated fields of divergence. A similar conclusion was drawn by ref. 41 for horizontal wind in the lowest 2 km. We believe that assimilation of near-surface horizontal winds from satellite-based scatterometers constrains the ERA5 near-surface divergence over ocean, making it possible to get an accurate vertical structure of \({{{\mathcal{D}}}}\). The small impact of assimilating the soundings is explained more generally by ref. 42 as ‘what often happens when one observing system is withdrawn from the data assimilation system is that other observing systems compensate for its loss and play a bigger role in constraining the analysis.’

The agreement between ERA5 and JOANNE is poorest for the flights on 22 January, 11 February and 13 February. The performance of ERA5 in reproducing \({{{\mathcal{D}}}}\) suffers most when it does not reproduce the horizontal winds (U) accurately (Extended Data Fig. 4). Whereas the association between the two is obvious, it raises the general question of whether there are particular conditions when ERA5 poorly reproduces U and hence \({{{\mathcal{D}}}}\). Measurements from the ATR-42 aircraft during EUREC4A showed flights on 11 February and 13 February to be the rainiest among all ATR-42 flights (see Table 4 in ref. 43). As mentioned earlier, these two days are among the three poorest for ERA5’s agreement with observations (Extended Data Figs. 1 and 4). For the last among the poorest days (22 January), the ATR-42 did not have a flight, but the region was dominated by a large fish structure (one of the four canonical cloud-organization patterns of ref. 44), which is known to produce large rain amounts45. Scatterometer measurements are known to suffer under rainy conditions, and therefore, we believe that this could be why ERA5 performs poorly on these days. However, a more robust investigation is needed to test our speculation. Attempts are underway to understand the performance of ERA5 in reproducing surface divergence patterns by comparing them with satellite observations.

Segmenting SMOC objects

To detect SMOC objects in the ERA5 \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}\) field, we introduce a crude measure to detect which gridboxes can be included as being part of SMOC objects. All gridboxes that have opposite signs of \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) and \({{{{\mathcal{D}}}}}_{{{{\rm{c}}}}}^{{\prime} }\) are considered SMOC cells (Fig. 4a and Extended Data Fig. 2). Such cells are further classified as either convergent cells if \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) < 0 or divergent if \({{{{\mathcal{D}}}}}_{{{{\rm{sc}}}}}^{{\prime} }\) > 0. Furthermore, the domain is segmented into multiple clusters of convergent and divergent cells based on a neighbour-identifying scheme where up to two orthogonal hops are made to consider a gridbox as a neighbour, or what is also known as a Queen’s contiguity case in spatial autocorrelation analysis46 (Fig. 4b). The statistics of SMOC objects are rather insensitive to whether a neighbour-identifying scheme of two orthogonal hops or one orthogonal hop is chosen. We use the label function from the measure module of Python’s scikit-image package (v0.19.2)47 to perform this.
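The classification and clustering steps can be sketched without external dependencies. The code below is a stand-in for the scikit-image label function used in the study, not a reproduction of it: it applies the dipole criterion to two small synthetic anomaly fields and then groups cells with a plain flood fill in which diagonal neighbours count as connected (the two-orthogonal-hop, Queen's contiguity case). All function names and field values are hypothetical.

```python
def classify_cells(d_sc_prime, d_c_prime):
    """Return a grid with -1 (convergent SMOC cell), +1 (divergent), 0 (no dipole)."""
    cells = []
    for row_sc, row_c in zip(d_sc_prime, d_c_prime):
        cells.append([
            (-1 if sc < 0 else 1) if sc * c < 0 else 0
            for sc, c in zip(row_sc, row_c)
        ])
    return cells


def label_clusters(cells, sign):
    """Label 8-connected clusters of cells equal to sign. Returns (labels, count)."""
    nrows, ncols = len(cells), len(cells[0])
    labels = [[0] * ncols for _ in range(nrows)]
    count = 0
    for i in range(nrows):
        for j in range(ncols):
            if cells[i][j] != sign or labels[i][j]:
                continue
            count += 1
            labels[i][j] = count
            stack = [(i, j)]
            while stack:  # flood fill over the 8 surrounding gridboxes
                r, c = stack.pop()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < nrows and 0 <= cc < ncols
                                and cells[rr][cc] == sign and not labels[rr][cc]):
                            labels[rr][cc] = count
                            stack.append((rr, cc))
    return labels, count


# SYNTHETIC 3 x 3 anomaly fields (arbitrary units).
d_sc = [[-1, -1, 2], [1, -1, 2], [1, 1, -1]]
d_c = [[1, 1, -1], [1, 1, -1], [-1, -1, 1]]

cells = classify_cells(d_sc, d_c)
_, n_convergent = label_clusters(cells, -1)
_, n_divergent = label_clusters(cells, 1)
coverage = sum(1 for row in cells for v in row if v != 0) / 9.0
print(cells, n_convergent, n_divergent, round(coverage, 2))
```

In this toy field the diagonal links merge cells that a one-hop scheme would split into separate clusters, which is the only difference between the two neighbour definitions.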

To get an estimation of the horizontal scale of these clusters, we estimate their major and minor axes, if they were fitted to an ellipse. Thus, the major and minor axes are defined as the larger and smaller second moments of area of these clusters, respectively. The first moment of area provides the coordinates for the centroids of clusters shown in Fig. 4b. The effective diameter (deff) of the clusters is the diameter of a circle equivalent in area to the area of the cluster. To avoid irregularities due to the coarse resolution of the ERA5 domain, we only consider clusters with major axis length greater than 0.75° as SMOC objects.
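A minimal sketch of how such length scales could be computed from the gridboxes of one cluster is given below. It assumes the common convention that the axis lengths of the equivalent ellipse are four times the square root of the eigenvalues of the coordinate covariance matrix (exact for a uniformly filled ellipse), a gridbox size of 25 km (0.25° with 1° ≈ 100 km), and an effective diameter from the summed gridbox area. Whether the study's software applies exactly this convention is not confirmed here, and the function name is hypothetical.

```python
import math


def ellipse_axes_and_deff(cells, cell_size_km=25.0):
    """Return (major_km, minor_km, d_eff_km) for a cluster of (row, col) gridboxes."""
    n = len(cells)
    # First moments: centroid of the cluster.
    mean_r = sum(r for r, _ in cells) / n
    mean_c = sum(c for _, c in cells) / n
    # Second central moments (coordinate covariance matrix).
    s_rr = sum((r - mean_r) ** 2 for r, _ in cells) / n
    s_cc = sum((c - mean_c) ** 2 for _, c in cells) / n
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in cells) / n
    # Closed-form eigenvalues of the symmetric 2 x 2 matrix.
    centre = (s_rr + s_cc) / 2.0
    spread = math.sqrt(((s_rr - s_cc) / 2.0) ** 2 + s_rc ** 2)
    lam_max, lam_min = centre + spread, max(centre - spread, 0.0)
    major = 4.0 * math.sqrt(lam_max) * cell_size_km
    minor = 4.0 * math.sqrt(lam_min) * cell_size_km
    # Diameter of the circle with the same area as the n gridboxes.
    d_eff = 2.0 * math.sqrt(n / math.pi) * cell_size_km
    return major, minor, d_eff


# SYNTHETIC cluster: a 2 x 6 block of gridboxes.
cluster = [(r, c) for r in range(2) for c in range(6)]
major, minor, d_eff = ellipse_axes_and_deff(cluster)

# Size filter from the text: major axis longer than 0.75° (about 75 km here).
is_smoc_object = major > 0.75 * 100.0
print(round(major, 1), round(minor, 1), round(d_eff, 1), is_smoc_object)
```

For an elongated cluster such as this one, the effective diameter falls between the minor and major axis lengths, mirroring the ordering of the three distributions in Fig. 4c.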