Introduction

Gravity waves (GWs) are a common phenomenon in the stratosphere1, and convection is common in the troposphere. The stratospheric gravity waves (SGWs) induced by convection are critical for both small- and large-scale atmospheric phenomena. At the small scale, SGWs associated with updrafts and downdrafts can result in turbulence. SGWs are one of the main sources of the turbulence that propagates away from convection sites2,3. With the development of the aerospace industry, especially near-space aerospace, turbulence induced by SGWs increases the risk for stratospheric aviation. Therefore, it is important to investigate the SGWs induced by convection to avoid disasters caused by turbulence. At the large scale, SGWs are an essential driver of atmospheric circulation1. Accurate simulation of SGWs is critical for the simulation of atmospheric circulation in general circulation models (GCMs). However, SGWs have a wide spectrum of horizontal wavelengths ranging from a few kilometers to thousands of kilometers4,5; therefore, part of the spectrum cannot be resolved by GCMs owing to the limitations of model resolution, as reported in Amiramjadi et al.6 (hereafter A23). Accordingly, GW parameterizations have been proposed for GCMs7,8.

Parameterization schemes for SGWs are highly simplified because of the lack of a comprehensive understanding of the generation mechanism and distribution characteristics of the SGWs8. In present-day models, parameterizations of non-orographic SGWs are generally specified as a constant mean source and the differences in wave sources are ignored8. In the real atmosphere, there are many types of wave sources such as fronts9, jet streams10, and tropical cyclones (TCs)11, all of which are tropospheric weather systems. Parameterizations in GCMs that ignore the differences between wave sources do not provide an accurate simulation because of the different distributions and mechanisms of SGWs induced by different sources12.

TCs are among the most destructive weather systems on Earth13,14,15, and they are significant sources of SGWs16,17,18. When TCs occur over the ocean, the SGW intensity (SGWI, hereafter) can increase considerably18; this is the so-called intermittency of SGWs19. A TC can induce SGWs through convection in the eyewall and rainband and through the wind shear around the outflow layer20. Such a complex wave source can generate a specific pattern of SGWs induced by a TC (hereafter TC-SGWs) through different mechanisms. Wang et al.20 used a numerical model to investigate the pattern of SGWs induced by a TC in the absence of background wind. They found a two-peak pattern of SGWI along the radial direction.

With global warming, the frequency of TCs is predicted to decrease21, whereas the potential intensity of the TCs may increase22. The TC-SGWI is related to TC intensity23, whereas TC-SGWs can help intensify both a TC during its development stage24 and the convection around the TC25. These processes increase the difficulty in forecasting TC intensity, which suggests that an accurate simulation of TC-SGWs is important for predicting TC intensity in a model.

Previous studies on the characteristics of TC-SGWs have investigated one or more TC cases using a numerical model26,27,28,29 and observations23,30,31,32. Nolan and Zhang23 used aircraft observations and surface instruments to show that the TC-SGWs radiate outward as tight spirals. Wang et al.33,34 used the Weather Research and Forecasting (WRF) 4.2 Model to investigate the SGWs induced by TC Lekima (2019). They found that intense SGWs are located mainly over the north of the TC because of the superposition of the SGWs induced by the TC and a trough. However, the scope of the simulation is limited, and the results may be case-dependent. Wright18 used 15 years of satellite observations to investigate TC-SGWs and found the TC-SGW momentum flux varies significantly at short distances from the TC center while it decreases with radial distance at distances greater than 600 km from the TC center. Although the radial pattern of the TC-SGWs has been established, the horizontal asymmetric pattern of the TC-SGWs remains unclear. Previous studies have considered only one or a few TC cases rather than the climatology of the TC-SGWs, so their conclusions are not robust and are case-dependent.

The climatology of SGWs has been investigated. Qu et al.35,36 used satellite observations to investigate SGWs in the Asian monsoon region, but their study used only 2 years of observations from December 2019 to November 2021 and the results are not robust. More than 30 years of data are needed to reveal the climatology of SGWs. Furthermore, their study focused on a local region, with little investigation of moving TCs.

Wei et al.37 used three kinds of data from the European Centre for Medium-Range Weather Forecasts (ECMWF), including the 9-km and 18-km Integrated Forecasting System (IFS) data and the 30-km ECMWF Reanalysis version 5 (ERA5) data38, to investigate the global distributions of SGWs and found that ERA5 can describe the SGWs. Cullens et al.39 used Cloud Imaging and Particle Size (CIPS), Atmospheric Infrared Sounder (AIRS), ECMWF-IFS and ERA5 data to study the SGWs generated by TC Yutu (2022). They showed some snapshots of concentric TC-SGW observations (Fig. 2 in their work) and concluded that the 9-km ECMWF-IFS and ERA5 data can both reproduce the TC-SGWs at the same location as the AIRS and CIPS observations. Generally, the distribution pattern of SGWs obtained using ERA5 is similar to that of the fine-resolution IFS data, although the intensity may be weaker. Compared with the IFS data, it is relatively easy to obtain ERA5 data with a sufficiently long time span. Pahlavan et al.40 used 10 years of ERA5 data to investigate the characteristics of tropical convective GWs. A machine learning algorithm based on ERA5 data was used by A23 to investigate SGWs and their influencing factors. They found that the background wind is more important for the SGWs than the sources, but they did not provide physical insights. Note that the background wind can modify not only the pattern and intensity of TC-SGWs41 but also TC movement and intensity42, so the patterns and driving mechanisms of TC-SGWs are complex. The climatology of the SGWs is related not only to the specific structure of the TC but also to the variation of the background wind. However, these characteristics are simplified in parameterizations owing to the limitations of observations, dynamic theory, and computational resources. Therefore, it is necessary to investigate the specific pattern and mechanisms of TC-SGWs.

Long-term ERA5 data can be used to reveal the climatological characteristics of SGWs with longer horizontal wavelengths39,40,43. However, to our knowledge, there have been no studies on the climatology of TC-SGWs using reanalysis data. Therefore, the climatology of the TC-SGW pattern and its causes require further study.

We studied TC-SGWs that formed over the northwest Pacific Ocean using ERA5 data from 1991 to 2020 and the China Meteorological Administration (CMA) Shanghai Typhoon Institute (STI) Best Track Dataset (CMA-STI Best Track Dataset)44,45. Our results provide a preliminary guide for GW parameterizations in GCMs, TC intensity forecasting, and TC-SGW research interests in certain regions relative to the TC. They also provide a preliminary guide for stratospheric aviation around a TC to avoid the turbulence induced by the TC-SGWs. The relative contributions of source and environmental fields (background wind) to the TC-SGWs are further investigated using artificial intelligence algorithms. A physical analysis of the influence factors is also given.

Results

Features of the TC-SGW pattern

Composite analysis of the SGWs at 30 hPa (~23 km) was carried out for TCs formed over the northwest Pacific Ocean during 1991–2020. The TC-SGWI is defined by the absolute value of the band-passed vertical velocity (see “Methods”).
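As a rough illustration of this definition, the sketch below band-passes a synthetic vertical-velocity field and takes its absolute value. It assumes a sharp spectral mask over the 400–2000 km band on a 30 km grid (the values given in “Methods”); the smooth response function used in the actual analysis is not reproduced here, and the function name `bandpass_2d` and the synthetic field are hypothetical.

```python
import numpy as np

def bandpass_2d(field, dx_km, lam_min_km=400.0, lam_max_km=2000.0):
    """Keep horizontal wavelengths in [lam_min_km, lam_max_km] with a sharp FFT mask.

    Assumption: a hard cutoff stands in for the smooth response function
    described in Methods, so some ringing would be expected on real data.
    """
    ny, nx = field.shape
    kx = np.fft.fftfreq(nx, d=dx_km)   # cycles per km along x
    ky = np.fft.fftfreq(ny, d=dx_km)   # cycles per km along y
    kxx, kyy = np.meshgrid(kx, ky)     # both have shape (ny, nx)
    k_total = np.hypot(kxx, kyy)       # total horizontal wavenumber
    mask = (k_total >= 1.0 / lam_max_km) & (k_total <= 1.0 / lam_min_km)
    return np.real(np.fft.ifft2(np.fft.fft2(field) * mask))

# Synthetic, periodic test field on a 30 km grid (6000 km x 6000 km).
dx_km = 30.0
x = np.arange(200) * dx_km
xx, yy = np.meshgrid(x, x)
wave_1000km = 1e-3 * np.sin(2 * np.pi * xx / 1000.0)   # inside the band, kept
wave_200km = 2e-3 * np.sin(2 * np.pi * xx / 200.0)     # too short, removed
wave_6000km = 5e-3 * np.sin(2 * np.pi * yy / 6000.0)   # too long, removed
w = wave_1000km + wave_200km + wave_6000km

w_prime = bandpass_2d(w, dx_km)   # band-passed vertical velocity
sgwi = np.abs(w_prime)            # intensity taken as |w'|
```

In this idealized case only the 1000 km wave survives the filter, so `sgwi` is simply its absolute value.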

The distribution pattern of TC-SGWs is shown in Fig. 1. The azimuths shown in Fig. 1 are measured counterclockwise from 0° to 360°, with east of the TC center at 0° and north of the TC center at 90°. The distribution pattern at other altitudes in the lower stratosphere is similar (not shown).
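The azimuth convention above can be written compactly. The following is a minimal sketch, assuming positions are given as eastward and northward offsets from the TC center; the function name `azimuth_deg` is hypothetical.

```python
import math

def azimuth_deg(dx_east_km, dy_north_km):
    """Azimuth relative to the TC center, counterclockwise from due east, in [0, 360)."""
    return math.degrees(math.atan2(dy_north_km, dx_east_km)) % 360.0

east = azimuth_deg(100.0, 0.0)          # due east of the center
north = azimuth_deg(0.0, 100.0)         # due north of the center
southeast = azimuth_deg(100.0, -100.0)  # southeast of the center, near 315 degrees
```

With this convention, the southeast quadrant spans azimuths from 270° to 360°.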

Fig. 1: Mean pattern of the TC-SGWI at 30 hPa and background wind averaged over r < 1600 km around the TC center.

a TC-SGWI (10−3 m s−1). b Absolute momentum flux (10−4 Pa). c Zonal momentum flux (10−4 Pa). d Meridional momentum flux (10−4 Pa). e Vertical profiles of the zonal (red line) and meridional (blue line) background wind (m s−1).

The TC-SGWI is highest in the azimuth band from 270° to 10° (Fig. 1a). Because the drag of SGWs on the circulation is calculated from the momentum flux, the momentum flux of SGWs is a critical variable for GCMs. Composite analyses of the wave momentum flux are shown in Fig. 1b–d. The pattern of the absolute momentum flux (MF, Fig. 1b) is similar to that of the TC-SGWI. The eastward zonal momentum flux (MFu, Fig. 1c) is also co-located with the MF and the TC-SGWI, which indicates that the TC-SGWs mainly carry zonal momentum flux. The northward (southward) meridional momentum flux (MFv, Fig. 1d) is located to the north (south) of the TC center, but the MFv is one order of magnitude weaker than the MFu, which is in agreement with Wei et al.37. Because the momentum-flux patterns resemble that of the TC-SGWI, we focus on the TC-SGWI in the following discussion of TC-SGW characteristics.

The traditional viewpoint is that SGWs are closely related to the wave sources6. To reveal the relationship between TC intensity and the TC-SGWI, a composite analysis based on TC intensity category (Supplementary Table 1)44 is shown in Fig. 2. The TC-SGWs are mainly found in the azimuth range from 270° to 10°, which is consistent with the pattern in Fig. 1a. The composite analyses of MFu, MFv and MF are similar (Supplementary Figs. 1–3).

Fig. 2: Composite analysis of the TC-SGWI (10−3 m s−1).

TC-SGWI for TC intensity category (a) TD, (b) TS, (c) STS, (d) TY, (e) STY, and (f) SuperTY. Dotted areas have passed the 99% significance test.

As the TC intensifies, the TC-SGWI (especially that in the southeastern part of the TC) intensifies significantly (Fig. 2a–f), indicating that the TC-SGWI depends on TC intensity. The maximum TC-SGWI at TD (Fig. 2a) is ~9.5 × 10−3 m s−1, and that at STY and SuperTY is ~11 × 10−3 m s−1, which is an increase of about 16%. Furthermore, the area of significant TC-SGWs also expands.

This result is unsurprising because the TC-SGWs are induced by the TC and have a robust relationship with the TC intensity20. However, the pattern of the TC-SGWs varies little with TC intensity, which suggests that the pattern of the TC-SGWs is not determined by TC intensity.

Both the intensity and position of the TC can change, and they are related. The intensity and track of the TCs during 1991–2020 are shown in Fig. 3. Most TCs form north of 5°N (Fig. 3b) and then move northward and westward, accompanied by an increase in their intensity. Most TCs reach their maximum intensity at about 15°N–30°N. When they move north of ~30°N, the intensity of most TCs decreases.

Fig. 3: Background wind at 30 hPa and TC track.

a Meridional shear of the background wind (m s−1 degree−1). b Background wind (arrows), TC track (lines). Line colors show the intensity of the TC.

The northward movement occurs across a wide range of latitudes, and the background circulation can change over such a large region. Therefore, the background wind at 30 hPa is also shown in Fig. 3; the background wind is easterly and varies with latitude (Fig. 3a, b). Accordingly, we now focus on the meridional movement of the TC. As the TCs move northward, the easterly wind strengthens, reaching a maximum at the tropical easterly jet at ~15°N (Fig. 3a). The background easterly wind then weakens northward until it becomes westerly at ~45°N. The meridional shear of the zonal wind is negative south of 15°N and positive from ~15°N to ~55°N, although most TCs do not reach the latter latitude. The most intense positive meridional shear of the zonal wind is found at 30°N. A small number of TCs move north of ~40°N into zones of near-zero zonal wind or weak westerly wind, where both the positive meridional shear of the zonal wind and the intensity of the TCs decrease.

In general, both the background wind and TC intensity (wave source) that vary with the movement of the TC can influence the TC-SGWs. Therefore, it is important to identify the main factors influencing TC-SGWs.

Influencing factors

To evaluate objectively the factors that influence the TC-SGW distribution pattern, a “neural network” (NN, hereafter) artificial intelligence algorithm is used to reconstruct the TC-SGWI pattern from 28 explanatory variables, including the background wind and vertical velocity at different heights and the 2-min mean maximum sustained wind of the corresponding TCs (Vmax). Details are given in the Methods section. After tuning the parameters of the NN (not shown), the relative error of the NN is 30%, close to the 32% obtained in A236 using random forests. The azimuthal variations of the TC-SGWI from both ERA5 and the NN are shown in Fig. 4a, b. The most intense TC-SGWs in both ERA5 and the NN are located at ~0° and ~315°, and the TC-SGWI from 45° to 135° is weak in both cases, which is consistent with the southeast pattern in Fig. 1a. The mean TC-SGWI at each azimuth is shown as a blue solid line in Fig. 4a, b. The correlation coefficient between the ERA5 and NN results is 0.81 and passes the 99% significance test. This indicates that the NN based on 28 explanatory variables can reproduce the climatological pattern of the TC-SGWI. Note that the extreme values of the TC-SGWI in ERA5 may be underestimated by the NN because extreme samples are scarce. However, this does not affect the purpose of this paper, which is to reveal the factors influencing the climatological TC-SGW pattern.
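The two skill measures quoted above can be illustrated as follows. This is only a sketch: the exact definition of the relative error is not stated in the text, so the mean-absolute-error form used here is an assumption, and the azimuthal profiles are synthetic placeholders rather than ERA5 or NN output.

```python
import numpy as np

def relative_error(reference, estimate):
    """Assumed metric: mean absolute error divided by the mean absolute reference."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean(np.abs(estimate - reference)) / np.mean(np.abs(reference)))

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D profiles."""
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical azimuthal profiles (arbitrary units), both peaking near 315 degrees.
azimuth = np.arange(0.0, 360.0, 10.0)
reference_profile = 8.0 + 2.0 * np.cos(np.deg2rad(azimuth + 45.0))
estimate_profile = 8.0 + 1.5 * np.cos(np.deg2rad(azimuth + 45.0))   # weaker peak

err = relative_error(reference_profile, estimate_profile)
r = pearson_r(reference_profile, estimate_profile)
```

The placeholder estimate has the right shape but a damped peak, which mirrors the kind of underestimation of extremes noted above: the correlation stays high while the relative error is nonzero.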

Fig. 4: Results of the NN and the attribution of the explanatory variables.

TC-SGWI (10−3 m s−1) from (a) ERA5, (b) NN. c Attribution of the explanatory variables in NN. Solid blue lines in (a, b) show the mean variations of TC-SGWI with azimuth.

The attribution of each explanatory variable is obtained by Layer-wise Relevance Propagation (LRP)46 and is shown in Fig. 4c. The background wind in the stratosphere (70–30 hPa), especially at 30 hPa, is the most important variable for the TC-SGWI pattern. The TC Vmax is the next most important variable for the TC-SGWI because the TC-SGWs are induced mainly by the eyewall20. Because the wind at 1000 hPa is at almost the same altitude as Vmax at 2 m, the wind at 1000 hPa is also slightly significant, showing the influence of the sources. Therefore, the most important variable is the background wind in the lower stratosphere, especially around the altitude of the TC-SGWs (30 hPa in this case). Hereafter, the wind at 30 hPa is chosen as the background wind in the lower stratosphere. Although the NN can select the main influencing factors, it cannot explain the physical mechanisms of the effect. Therefore, the mechanisms by which the background wind exerts its influence should also be investigated.
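To make the attribution step more concrete, the sketch below applies the LRP epsilon rule to a tiny bias-free ReLU network. It is an illustration of the general technique only: the network, its weights, and the function name `lrp_epsilon_dense` are hypothetical and unrelated to the NN used in this study.

```python
import numpy as np

def lrp_epsilon_dense(activations, weights, relevance_out, eps=1e-9):
    """Redistribute relevance through one bias-free dense layer (LRP epsilon rule)."""
    z = activations @ weights                     # pre-activations of the upper layer
    z = z + eps * np.where(z >= 0.0, 1.0, -1.0)   # stabiliser avoids division by zero
    s = relevance_out / z
    return activations * (weights @ s)

# Toy network with hypothetical weights: 3 inputs -> 2 ReLU units -> 1 output.
x = np.array([1.0, 2.0, -1.0])
w1 = np.array([[1.0, -1.0],
               [0.5, 1.0],
               [-1.0, 2.0]])
w2 = np.array([[2.0],
               [1.0]])

hidden = np.maximum(x @ w1, 0.0)   # ReLU activations: [3, 0]
output = hidden @ w2               # network output: [6]

# Start from the output and propagate relevance back to the inputs.
relevance_hidden = lrp_epsilon_dense(hidden, w2, output)
relevance_input = lrp_epsilon_dense(x, w1, relevance_hidden)
```

Here each input receives a relevance of about 2, and the relevances sum to the output value of 6. This conservation property is what allows the input relevances to be read as shares of the prediction.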

Mechanisms of background wind influence on the pattern of TC-SGW

The distinct TC-SGWs are located mainly in the southeast quadrant of the TC, which is the upstream side of the easterly background wind in the stratosphere (Fig. 1e), indicating propagation against the background wind11,26,41. Because the TC-SGWs with the same phase speed as the background wind will be filtered out at the critical level47, the TC-SGWs propagating eastward can be retained. The meridional wind is much smaller than the zonal wind (Fig. 1e), so the filtration effect of the meridional wind may not be the main cause of the formation of the TC-SGWI pattern.

The background wind at 30 hPa is the most important background wind (Fig. 4c), and it changes with latitude (Fig. 3a, b). The meridional shear of the background wind is negative south of 15°N and turns positive north of 15°N. The meridional shear increases from 15°N to 30°N, where it reaches its maximum. Based on these differences in the background wind, the TC-SGWI is divided into three latitude regions for composite analysis, and the corresponding results are shown in Fig. 5. The composite analysis results for MFu, MFv, and MF are similar to those for TC-SGWI (Supplementary Figs. 4–6). The maximum of the TC-SGWI is located north of the TC center when the TC centers are south of 15°N (Fig. 5a), where there is intense negative meridional shear of the zonal wind (Fig. 3a). When the TC centers are north of 15°N, the maximum of the TC-SGWI is located southeast (Fig. 5b) and even farther south (Fig. 5c) of the TC center. The center of maximum TC-SGWI rotates clockwise with increasing center latitude. There are very few samples when the TC centers are at 30°–40°N (Fig. 5c; only 12.0%), and the TC intensity is much weaker (Fig. 3b). Therefore, the pattern in Fig. 5c is not investigated further here.

Fig. 5: Composite analysis of the TC-SGWI (10−3 m s−1) for different latitudes of the TC center.

The latitude of the TC center is at (a) 0°N–15°N, (b) 15°N–30°N, and (c) 30°N–40°N. Dotted areas have passed the 99% significance test.

Dunkerton48 and Forbes et al.49 studied the refraction of SGWs by the horizontal wind shear and gave the relation between wind shear and wavenumber as:

$$\frac{dl}{dt}=-k\frac{\partial u}{\partial y},$$
(1)

where k and l are the zonal and meridional wavenumbers of the SGWs, t and y are time and meridional distance, and u is the zonal wind. The easterly wind (red lines in Fig. 1e) can filter out the TC-SGWs propagating westward47,50, so the TC-SGWs propagate eastward with positive zonal wavenumber.

According to Eq. (1), because of the negative meridional shear of the zonal wind south of 15°N, the TC-SGWs propagating eastward are refracted northward with positive meridional wavenumber. North of 15°N, the meridional shear of the zonal wind becomes positive and the TC-SGWs propagating eastward are refracted southward (Fig. 5b). Therefore, although the meridional background wind is weak in Fig. 1e, refraction by the meridional shear of the zonal background wind can change the propagation direction of the TC-SGWs and result in the southeast pattern of TC-SGWs (Fig. 5b) which is consistent with the previous studies51,52.
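The sign argument based on Eq. (1) can be checked with a few lines of code. This is a minimal sketch in which the zonal wavelength and shear magnitudes are illustrative assumptions, not values from the analysis, and the function names are hypothetical.

```python
import math

def dl_dt(k, du_dy):
    """Right-hand side of Eq. (1): tendency of the meridional wavenumber l."""
    return -k * du_dy

def refraction_direction(k, du_dy):
    """Direction toward which a wave with zonal wavenumber k is refracted."""
    tendency = dl_dt(k, du_dy)
    if tendency > 0.0:
        return "northward"   # l grows positive
    if tendency < 0.0:
        return "southward"   # l grows negative
    return "none"

k_eastward = 2.0 * math.pi / 1000e3   # rad m^-1, assumed 1000 km zonal wavelength
shear_south_of_15n = -0.5 / 111e3     # s^-1, illustrative negative shear
shear_north_of_15n = 0.5 / 111e3      # s^-1, illustrative positive shear

south_case = refraction_direction(k_eastward, shear_south_of_15n)
north_case = refraction_direction(k_eastward, shear_north_of_15n)
```

For an eastward-propagating wave (k > 0), negative shear yields northward refraction and positive shear yields southward refraction, matching the patterns in Fig. 5a and Fig. 5b.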

To show the effect of refraction by the background wind more quantitatively, we used the mid-frequency approximation, which has been applied by Horinouchi et al.53, to estimate the horizontal phase speed of the TC-SGWs. Here, the horizontal phase speed is about 24–48 m s–1, which is consistent with Horinouchi et al.53 and with the typical phase-speed range of TC-SGWs. According to Fritts and Alexander1, in the mid-frequency approximation, the horizontal group speed has the same magnitude as the phase speed of the TC-SGWs. When the background field depends only on latitude, the left-hand side of Eq. (1) can be approximated as \(\frac{dl}{dt}={c}_{gy}\frac{\partial l}{\partial y}\approx {c}_{y}\frac{\varDelta l}{\varDelta y}\), where \({c}_{gy}\) is the meridional group speed, which is approximated by the meridional phase speed \({c}_{y}\) (general range of 24–48 m s–1), and \(\varDelta y\) is the meridional propagation distance. From Fig. 1a, the range of the meridional propagation of the TC-SGWs is about 400–1600 km. Combining Eq. (1) with this simplified form of its left-hand side gives \(\frac{\varDelta l}{k}\approx -\frac{\varDelta y}{{c}_{y}}\frac{\partial u}{\partial y}\), so the angle between the zonal propagation direction and the actual horizontal propagation direction caused by refraction can be estimated as \(\arctan (|\frac{\varDelta y}{{c}_{y}}\frac{\partial u}{\partial y}|)\). For the meridional shear at 22.5°N and 30°N (\(\frac{\partial u}{\partial y}\) = 0.5 m s–1 degree–1 and 1 m s–1 degree–1, respectively), the angle ranges are 2.1°–16.7° and 4.3°–31.0°, respectively. From Fig. 1a, the central angle between the zonal propagation direction and the actual horizontal propagation direction of the intense TC-SGWs is about 22.5°, which lies within the range estimated for the shear at 30°N. Therefore, refraction by the background wind can be an important formation mechanism of the TC-SGW distribution.
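The angle estimates in this paragraph can be reproduced with the short sketch below. It assumes a conversion of about 111 km per degree of latitude, which is not stated in the text, and the function names are hypothetical.

```python
import math

M_PER_DEGREE_LAT = 111.0e3   # assumption: about 111 km per degree of latitude

def refraction_angle_deg(shear_per_degree, dy_m, c_y):
    """arctan(|dy / c_y * du/dy|) in degrees.

    shear_per_degree: meridional shear of the zonal wind (m s^-1 per degree)
    dy_m: meridional propagation distance (m)
    c_y: meridional phase speed (m s^-1)
    """
    shear = shear_per_degree / M_PER_DEGREE_LAT   # convert to s^-1
    return math.degrees(math.atan(abs(dy_m / c_y * shear)))

def angle_range_deg(shear_per_degree):
    """Smallest and largest angle over dy = 400-1600 km and c_y = 24-48 m s^-1."""
    smallest = refraction_angle_deg(shear_per_degree, 400e3, 48.0)
    largest = refraction_angle_deg(shear_per_degree, 1600e3, 24.0)
    return smallest, largest

range_22p5n = angle_range_deg(0.5)   # about 2.1 to 16.7 degrees
range_30n = angle_range_deg(1.0)     # about 4.3 to 31.0 degrees
```

The smallest angle pairs the shortest propagation distance with the fastest phase speed, and the largest angle pairs the longest distance with the slowest phase speed.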

The intense TC-SGWs in the composite results (Fig. 1a) are located to the southeast of the TC center, similar to the pattern when the TC centers are at 15°N–30°N (Fig. 5b). The southeast pattern of the TC-SGWI is the most common because both the number of samples and the intensity of TCs with centers at 15°N–30°N are much larger than those in other regions (61.4% of the total samples, see “Methods” section). The numbers of samples for the other patterns, such as those in Fig. 5a, c, are smaller (26.6% and 12.0%, respectively). Therefore, the southeast pattern of the TC-SGWI is closely associated with the background wind.

To further verify this argument, the variability of the background wind should be considered, because the background wind does not remain constant in time. Composites of the background wind in different seasons are given in Supplementary Fig. 7. Because TC intensity differs between seasons, composites by season are not pursued further. The Quasi-Biennial Oscillation (QBO), a typical interannual oscillation of the tropical stratospheric background wind54, can represent the temporal variation of the background wind. The time series of the QBO index is obtained from the National Oceanic and Atmospheric Administration (NOAA, see “Data”) and is shown in Supplementary Fig. 8.

The composites of the background wind at the negative and positive QBO index (phase) are shown in Fig. 6. TCs at both the negative and positive QBO phases are found from 15°N to 30°N (Fig. 6a, b), and the TC intensity is similar in the two phases. The mean TC intensities at the negative and positive QBO phases are 23.08 m s–1 and 22.58 m s–1, respectively, which indicates that differences in TC intensity should not confound the investigation of the influence of the background wind on the TC-SGWs.

Fig. 6: Composite analysis of 30 hPa background wind, TC track and TC-SGWI at different QBO phases.

a Background wind (arrows) and TC tracks (thin lines) at the negative QBO phase. b As in (a), but for the positive QBO phase. The line colors in (a, b) show the intensity of the TC. The thick black solid lines in (a, b), which correspond to the upper x-axis, show the meridional shear of the zonal wind (m s–1 degree–1), and the thick black dashed lines in (a, b) mark zero meridional shear. c TC-SGWI at the negative QBO phase. d As in (c), but for the positive QBO phase. Dotted areas in (c, d) have passed the 99% significance test.

In contrast, the background wind differs significantly between the negative and positive QBO phases. Although the wind is easterly from 5°N to 30°N in both phases, there is weak (intense) negative meridional shear of the background wind south of 7°N (20°N) at the negative (positive) QBO phase, and positive meridional shear of the background wind north of 7°N (20°N) at the negative (positive) QBO phase.

According to Eq. (1), the different background wind at the negative and positive QBO phases can result in different TC-SGW distributions. To investigate this, composite analyses of the TC-SGWI at the two QBO phases are shown in Fig. 6c, d. The pattern of the TC-SGWs at the negative QBO phase (Fig. 6c) is similar to the mean pattern of the TC-SGWI (Fig. 1a), with the intense TC-SGWs located mainly to the southeast of the TC center. Because the TC-SGWs at the negative QBO phase are embedded in easterly wind, they mainly propagate eastward. They also experience positive meridional shear (Fig. 6a), so according to Eq. (1) they are refracted southward.

However, the pattern of the TC-SGWs at the positive QBO phase (Fig. 6d) is different from the mean pattern in Fig. 1a, with the intense TC-SGWs located mainly to the east rather than the southeast of the TC center. The TC-SGWs at the positive QBO phase are still embedded in easterly wind, so they mainly propagate eastward. However, the TCs at the positive QBO phase experience both negative and positive meridional shear. According to Eq. (1), positive (negative) meridional shear refracts the eastward-propagating TC-SGWs toward the south (north) of the TC center. Because southward and northward refraction both occur, the composite TC-SGWs at the positive QBO phase propagate eastward with little net meridional shift.

Generally, the southeast pattern of the TC-SGWI is mainly dominated by the background wind especially the easterly wind and the positive meridional shear of the background wind. The pattern can vary with latitude and time (QBO phases).

Discussion

In this paper, the characteristics of SGWs induced by TCs formed over the northwest Pacific Ocean have been investigated using ERA5 reanalysis data from 1991 to 2020, and a climatology of the TC-SGW pattern around the TC center is presented. Intense TC-SGWs with intense MF are located to the southeast of the TC center. These TC-SGWs are associated with intense eastward MFu, with the MFv being an order weaker than the MFu. Composite analysis of TC intensity was used to reveal the sensitivity of the TC-SGWs to TC intensity. The TC-SGWI generally increases with TC intensity, and there is a robust relationship between them. However, the TC-SGWs are located mainly in the southeastern quadrant of the TC center regardless of TC intensity, which indicates that TC intensity does not dominate the TC-SGW pattern. The variation of TC intensity is associated with the variation of the position of the TC, and TCs can move north from the tropics to middle latitudes, so the background wind around the TC-SGWs will change. To further reveal important factors that influence the TC-SGWI pattern, a NN was used to objectively evaluate the effect of the background wind, TC intensity, and convection on the TC-SGWI pattern. The results show that the background wind in the lower stratosphere plays the most important role. Because the TC-SGWs are induced mainly by the eyewall of the TCs20, the TC intensity is also important for the TC-SGWI. Finally, the mechanism for the influence of the background wind on the TC-SGWI pattern was investigated. Filtering by the easterly zonal background wind means that the TC-SGWs are found mainly upstream of the background wind (i.e., to the east of the TC center). However, we also found that the weak meridional wind may not be the main cause of the TC-SGWs lying in the southeastern quadrant of the TC. Instead, the pattern of TC-SGWs is related to the easterly background wind that varies with latitude. 
When the TC centers are south of 15°N, the TC-SGWs are located in the northeastern quadrant of the TC because the TC-SGWs propagating eastward are refracted northward in the negative background wind shear. When the TC centers lie in the range of 15°N–30°N, the TC-SGWs are located in the southeastern quadrant of the TC because the TC-SGWs propagating eastward are refracted southward in the positive background wind shear. The climatological pattern of the TC-SGWs is the southeast pattern, as the number of TC samples and their intensity at 15°N–30°N are much larger than in other regions. Because the background wind can also vary significantly with time, and the QBO is an important mode of temporal wind variation in the tropics, we also compared the two QBO phases. The TC-SGWs at the negative QBO phase are refracted southward because of the intense positive meridional shear of the background wind, whereas those at the positive QBO phase are located mainly to the east of the TC with little meridional shift, because both positive and negative meridional shear of the background wind are present. These results support our argument that the TC-SGW pattern is dominated by the zonal background wind through filtering and refraction.

In general, TCs have received much research attention because they are strong convection systems in the troposphere, whereas turbulence in the stratosphere caused by TC-SGWs2,3 has been less studied because it occurs far from the TC. However, these areas of turbulence can pose a risk for aviation, especially in the stratosphere above the TC. We conclude that there are intense TC-SGWs around the TC that may generate turbulence2,3. The TC-SGWs can also propagate horizontally far from the TC if the background wind is favorable. The stratospheric aviation community therefore needs to consider not only the TC and the background wind but also the TC-SGWs in order to avoid the turbulence they induce when aircraft pass above a TC. Because the background wind can change the pattern of the TC-SGWs significantly, the background circulation around the aircraft is of key importance.

Previous studies have highlighted the importance of the source for the SGWs in the lower stratosphere. We found that the background wind in the lower stratosphere may be the dominant factor affecting SGWs because of its refractive and filtering effects. With the movement of the TCs, the TC intensity and background wind can both change, which can result in variation of the TC-SGWs. Thus, the variation of TC-SGWs with the TCs and the background wind should also be considered in the GW parameterizations in GCMs. Because the TC intensity and refraction by the background wind can influence the TC-SGWI and the TC-SGW pattern, a parameterization scheme that ignores the variation of the wave sources and background wind may not be suitable.

More intense TCs will occur as a result of global warming, as indicated by the results from GCMs22. However, because of the highly simplified GW parameterization that ignores the variations of the TC-SGWs and the influence of TC-SGWs on TC intensity24, the results from the GCMs may not be accurate. In addition, the TC-SGWs are closely related to the background circulation, which may also vary with global warming55. The background wind may influence the TC-SGWs and then the TCs, which may further increase the complexity of the prediction of TC intensity.

Our results also (1) provide a preliminary guide to TC-SGWs for research interests and aviation safety in the context of the TC, (2) highlight the importance of the background wind in the lower stratosphere for TC-SGWs, and (3) reveal the mechanism by which the background wind influences the TC-SGWs.

Methods

Data

The SGWs and the background wind are all obtained from the ECMWF ERA5 dataset. Hourly data are available with a spatial resolution of 0.25° on 137 model vertical levels56, which can resolve part of the spectrum of SGWs reasonably well6. For each TC, once it has been numbered, we download data covering an area of 50° latitude by 170° longitude around the TC center. There were 859 TCs that formed over the northwest Pacific Ocean from 1991 to 2020. We only use the data at 30 hPa to resolve the SGWs because of space restrictions and the similarity between the SGWs at 30 hPa and those at other lower-stratospheric levels.

The explanatory variables in the NN are obtained from the low-resolution dataset with 2.5° grid spacing. For the explanatory variables, the convection intensity is represented by the vertical velocity at 12 pressure levels from 1000 to 100 hPa in the troposphere, and the background wind is represented by the horizontal wind at 15 pressure levels from 1000 to 30 hPa, where the 70, 50, and 30 hPa levels are in the stratosphere. The intensity of the TC is represented by Vmax obtained from the CMA-STI Best Track Dataset provided by the Shanghai Typhoon Institute. This dataset covers the basin north of the equator and west of 180°E from 1991 to 2020, with 6-hourly track and intensity analyses44,45. The QBO index is obtained from NOAA and is defined as the zonal average of the 30 hPa zonal wind at the equator.

The training and validation data for the NN also need to be clarified. The input data consist of the explanatory variables including the vertical velocity, background wind, and TC Vmax. We use 75% (25%) of the input data to train (validate) the model. The output data are the gridded TC-SGWI. The number of samples in Fig. 4 is 611,251, which is large enough for the machine learning algorithm.
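A 75%/25% split of this kind can be sketched as follows. The text does not say how samples are assigned to each subset, so the random shuffling, the seed, and the function name `split_train_validation` are assumptions, and the arrays are placeholders.

```python
import numpy as np

def split_train_validation(features, targets, train_fraction=0.75, seed=0):
    """Randomly split samples into training and validation subsets.

    Assumption: samples are shuffled before splitting; the original
    assignment procedure is not described in the text.
    """
    n_samples = features.shape[0]
    order = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    train_idx, val_idx = order[:n_train], order[n_train:]
    return (features[train_idx], targets[train_idx]), (features[val_idx], targets[val_idx])

# Placeholder data: 1000 samples with 28 explanatory variables each.
rng = np.random.default_rng(1)
features = rng.normal(size=(1000, 28))
targets = rng.normal(size=1000)

(train_x, train_y), (val_x, val_y) = split_train_validation(features, targets)
```

With 1000 placeholder samples this yields 750 training and 250 validation samples, and the two subsets do not overlap.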

Method for resolving SGWs from ERA5

An exact calculation of the SGWs (i.e., the intensity of the SGWs, zonal and meridional momentum flux, and absolute momentum flux) requires high-frequency data (on the order of 10 min or less) to allow the spectral analysis to be carried out37,57. However, the limitations of ERA5 mean that we can only obtain the SGWs for each available instantaneous model output. To achieve this, there are four steps.

First, the ERA5 data around the TC center identified from the CMA-STI Best Track data are linearly interpolated from the 0.25° grid to a 30 km grid37. Using a large enough area ensures that the SGWs can be obtained with satisfactory spectral and spatial resolution58,59. The second step is bandpass filtering with the fast Fourier transform (FFT). As reported by Pahlavan et al.40, perturbations with horizontal wavelengths longer than ~400 km are well resolved by ERA5. Considering that the upper limit of the mesoscale is 2000 km, waves with horizontal wavelengths of 400–2000 km are obtained by multiplying the Fourier coefficients by a response function to produce filtered coefficients, which avoids the Gibbs phenomenon. The detailed response function can be found in Kruse and Smith60. Waves in this band in the stratosphere are taken to be SGWs37. The above calculation can be summarized as:

$$A^{\prime} =FFT{\_}B{P}_{400}^{2000}[L{I}_{0.25}^{30}[A]]$$
(2)

where \(A^{\prime}\) is the perturbation of a variable, \(FFT{\_}B{P}_{400}^{2000}[]\) is the FFT bandpass filtering operator for wavelengths from 400 km to 2000 km, \(L{I}_{0.25}^{30}[]\) is the linear interpolation operator from 0.25° to 30 km, and A is the corresponding variable from ERA5. ERA5 provides the vertical velocity with respect to pressure (ω), whereas we need the vertical velocity with respect to height (w). Under the hydrostatic approximation, the two are related by:

$$w\,\approx \frac{-\omega }{\rho g}$$
(3)

where ρ is the density and g is the gravitational acceleration. The absolute value of the perturbation vertical velocity, \(|w^{\prime} |\), represents the intensity of the TC-SGWs (TC-SGWI)20,25,30,33. The zonal (MFu), meridional (MFv), and absolute (MF) momentum fluxes are obtained as follows:

$$MFu={\rho }_{0}\overline{u^{\prime} w^{\prime} },$$
(4)
$$MFv={\rho }_{0}\overline{v^{\prime} w^{\prime} },$$
(5)
$$MF={\rho }_{0}\sqrt{{(\overline{u^{\prime} w^{\prime} })}^{2}+{(\overline{v^{\prime} w^{\prime} })}^{2}},$$
(6)

where (u′, v′, w′) is the vector of wind perturbations from Eq. (2), and \({\rho }_{0}\) is the background density. Finally, the intensity and the momentum flux of the SGWs are interpolated onto polar coordinates with a radial resolution of 40 km and an azimuthal resolution of 5°. The explanatory variables are also interpolated onto polar coordinates, but at a resolution of 400 km and 45°, and components with horizontal wavelengths shorter than 2000 km are filtered out of them.
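The filtering and flux steps of Eqs. (2) to (6) can be sketched as follows, assuming NumPy and a field already interpolated onto a uniform 30 km grid. Several choices here are assumptions rather than the method used in the paper: a sharp spectral mask stands in for the smooth response function of Kruse and Smith60, the overbar is taken as a simple domain mean, the domain is treated as periodic, and the test field is synthetic. The function names are hypothetical.

```python
import numpy as np

G = 9.80665  # gravitational acceleration, m s^-2

def bandpass_2d(field, dx_km, lambda_min_km=400.0, lambda_max_km=2000.0):
    """Keep horizontal wavelengths between lambda_min_km and lambda_max_km.

    Assumption: a sharp (boxcar) mask is used for brevity; the paper applies
    a smooth response function instead, which is not reproduced here.
    """
    ny, nx = field.shape
    kx = np.fft.fftfreq(nx, d=dx_km)          # cycles per km along x
    ky = np.fft.fftfreq(ny, d=dx_km)          # cycles per km along y
    k = np.hypot(*np.meshgrid(kx, ky))        # total horizontal wavenumber
    mask = (k >= 1.0 / lambda_max_km) & (k <= 1.0 / lambda_min_km)
    return np.real(np.fft.ifft2(np.fft.fft2(field) * mask))

def omega_to_w(omega, rho):
    """Eq. (3): hydrostatic conversion from omega (Pa/s) to w (m/s)."""
    return -omega / (rho * G)

def momentum_flux(u_p, v_p, w_p, rho0):
    """Eqs. (4)-(6); the overbar is taken as a domain mean (assumption)."""
    uw = np.mean(u_p * w_p)
    vw = np.mean(v_p * w_p)
    return rho0 * uw, rho0 * vw, rho0 * np.hypot(uw, vw)

# Synthetic check: plane waves in x on a 500 x 100 grid with 30 km spacing.
dx = 30.0
X = np.tile(np.arange(500) * dx, (100, 1))

def plane_wave(wavelength_km):
    return np.cos(2.0 * np.pi * X / wavelength_km)

signal = plane_wave(1000.0)                   # inside the 400-2000 km band
raw = signal + 2.0 * plane_wave(5000.0) + 0.5 * plane_wave(150.0)
filtered = bandpass_2d(raw, dx)               # should recover `signal`

rho0 = 0.05                                   # illustrative density, kg m^-3
u_p = 2.0 * signal
v_p = np.zeros_like(signal)
w_p = omega_to_w(-rho0 * G * 0.1 * signal, rho0)   # equals 0.1 * signal
mfu, mfv, mf = momentum_flux(u_p, v_p, w_p, rho0)
```

The synthetic wavelengths are chosen to fit the periodic domain exactly, so the filter isolates the 1000 km wave without spectral leakage. Real ERA5 subdomains are not periodic, which is one reason a tapered response function is preferable to a sharp mask.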

To evaluate how well ERA5 reproduces the TC-SGWs, the radial distribution of the azimuthal mean of the TC-SGWI and of the absolute momentum flux (MF) is calculated and shown in Supplementary Fig. 9. The radial distribution of the TC-SGWI and MF from ERA5 agrees well with Fig. 10 in Wright18, whose result is based on 15 years of TC-SGW observations by the High Resolution Dynamics Limb Sounder. This indicates that the ERA5 data can be used to study the TC-SGWs.

Composite analysis

The composite analysis in the main text is carried out for all TC-SGW samples and for samples grouped by TC intensity and by the latitude of the TC center. The total number of samples in Fig. 1 is 25,344 (i.e., from 1991 to 2020 every 6 h during each TC lifetime). The numbers of samples in Fig. 2a–f are 5432, 5148, 3923, 3578, 2056, and 912, respectively. The numbers of samples in Fig. 5a–c are 5784, 13,327, and 2585, respectively.

The samples for the composite analysis based on the QBO phase are the mean TC-SGWs over the whole lifetime of each TC, to be consistent with the tracks in Fig. 6a, b. The total number of TCs is 859, as described in the Data section, and the numbers of samples in Fig. 6a, c and Fig. 6b, d are 380 and 479, respectively.

Student’s t test

Student’s t test is used to determine the statistical significance level. The t statistic is defined as:

$$t=r\sqrt{\frac{n-2}{1-{r}^{2}}},$$
(7)

where n is the sample size, and r is Pearson’s correlation coefficient, which is defined as follows:

$$r=\frac{\mathop{\sum }\limits_{i=1}^{n}({x}_{i}-\overline{x})({y}_{i}-\overline{y})}{\sqrt{\mathop{\sum }\limits_{i=1}^{n}{({x}_{i}-\overline{x})}^{2}}\sqrt{\mathop{\sum }\limits_{i=1}^{n}{({y}_{i}-\overline{y})}^{2}}},$$
(8)

where \(\overline{x}\) and \(\overline{y}\) are the sample mean values of \({x}_{i}\) and \({y}_{i}\), respectively.
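As a worked illustration of Eqs. (7) and (8), the following sketch computes r and t for a small made-up sample using only the Python standard library. The data values and function names are illustrative and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient, Eq. (8)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for a correlation r from n samples, Eq. (7)."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)             # 0.8 for this sample
t = t_statistic(r, len(x))      # about 2.31, with n - 2 = 3 degrees of freedom
```

The resulting t is compared with the critical value of Student's t distribution with n − 2 degrees of freedom at the chosen significance level.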

This is a test of the null hypothesis that two independent samples have identical average (expected) values. Our null hypothesis is that each class in the composite analysis has the same average values over the composite samples as the mean distribution of the TC-SGWs in Fig. 1a. If the result of the test is significant, we consider that the corresponding class of the composite analysis has average values different from the mean distribution of the TC-SGWs in Fig. 1.

“Neural networks” algorithm

The “neural network” algorithm61, hereafter NN, has been widely used in meteorology46,62 and in studies of SGWs63,64. The NN is appropriate for large data volumes and can identify nonlinear relationships among the data. The selection of the input variables is not critical for the NN because it can discard unnecessary variables through automatic learning. After adjustment, the NN in our investigation is a backpropagation neural network with four hidden layers of 300, 600, 300, and 100 nodes. The activation function is the Rectified Linear Unit (ReLU). The NN is optimized by Adaptive Moment Estimation (Adam). The learning rate of stochastic gradient descent, which is a parameter determining how the node weights are optimized, is between 0.1 and \(10^{-10}\). The loss function is the mean absolute percentage error, for convenience of comparison with the results from A23.
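The architecture described above can be sketched as a forward pass in NumPy. The input and output sizes, the He-style weight initialization, and the linear output layer are assumptions made for illustration, since the text specifies only the hidden-layer widths, the ReLU activation, and the loss. Training with Adam is not shown.

```python
import numpy as np

HIDDEN = (300, 600, 300, 100)   # hidden-layer widths given in the text

def init_params(n_in, n_out, rng):
    """Random weights and zero biases for each layer (He-style scaling, assumed)."""
    sizes = (n_in, *HIDDEN, n_out)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """ReLU on the hidden layers; linear output layer (assumption)."""
    h = x
    for weights, bias in params[:-1]:
        h = np.maximum(h @ weights + bias, 0.0)
    weights, bias = params[-1]
    return h @ weights + bias

def mape(y_true, y_pred, eps=1e-12):
    """Mean absolute percentage error; eps guards against division by zero."""
    return 100.0 * np.mean(np.abs(y_true - y_pred) / (np.abs(y_true) + eps))

# Hypothetical sizes: 64 input features and 32 output grid points.
rng = np.random.default_rng(0)
params = init_params(n_in=64, n_out=32, rng=rng)
batch = rng.normal(size=(4, 64))
prediction = forward(params, batch)
example_loss = mape(np.array([1.0, 2.0]), np.array([1.1, 1.8]))   # 10 percent
```

Because the mean absolute percentage error divides by the target, grid points where the TC-SGWI is close to zero can dominate the loss, which is worth keeping in mind when comparing values across studies.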

A disadvantage of the NN, however, is that its results are difficult to interpret. To build an explainable NN, so-called “attribution methods” are used. Layer-wise Relevance Propagation (LRP) is one such attribution method and has been used to evaluate the attribution of the explanatory variables for convection46. LRP decomposes each output value into contributions from the individual inputs. LRP-p is an improved version of LRP that evaluates the attribution of the input variables more objectively46.
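The decomposition performed by LRP can be illustrated with the basic epsilon rule on a small bias-free ReLU network. This is not the LRP-p variant used in this study, whose details are given in ref. 46; the network sizes, random weights, and function names below are hypothetical.

```python
import numpy as np

def forward_activations(weights, x):
    """Return the input plus the output of every layer (ReLU on hidden layers)."""
    activations = [x]
    for i, w in enumerate(weights):
        z = activations[-1] @ w
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))
    return activations

def lrp_epsilon(weights, x, out_index, eps=1e-9):
    """Relevance of each input for one output, using the LRP epsilon rule."""
    activations = forward_activations(weights, x)
    # Start with all relevance on the chosen output value.
    relevance = np.zeros_like(activations[-1])
    relevance[out_index] = activations[-1][out_index]
    # Redistribute relevance layer by layer, from the output back to the input.
    for w, a in zip(reversed(weights), reversed(activations[:-1])):
        z = a @ w
        z = z + eps * np.where(z >= 0.0, 1.0, -1.0)   # stabilizer near zero
        relevance = a * (w @ (relevance / z))
    return relevance

# Hypothetical network: 6 inputs, hidden layers of 8 and 5 nodes, 3 outputs.
rng = np.random.default_rng(0)
sizes = (6, 8, 5, 3)
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
x = rng.normal(size=6)
output_value = forward_activations(weights, x)[-1][0]
relevance = lrp_epsilon(weights, x, out_index=0)
```

For a bias-free network and a small stabilizer, the input relevances sum approximately to the chosen output value, which is the conservation property that makes the decomposition interpretable as per-input contributions.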