Introduction

Sea salt accounts for the major fraction of the aerosol population in the marine boundary layer and is generated over the oceans mainly by wind. This important aerosol component is produced predominantly through wave breaking, which entrains air into the subsurface ocean; the air rises back to the surface as bubbles that form whitecaps and burst1, injecting seawater drops into the atmosphere. Wind speed plays a crucial part in triggering the waves, whereas wind direction and turbulent mixing help to entrain the sea salt spray into the atmosphere. The production rates of interfacial sea salt (generated at the ocean surface) and effective sea salt (which remains in the atmosphere for a longer period) are essentially governed by conditions such as humidity, wind, and dry deposition rates2. The net loss of sea salt in the coarse mode regime occurs mostly through gravitational settling and wet deposition3, by which the particles fall back to the ocean. Consequently, small drops and fine mode particles are the ones most efficiently entrained vertically into the atmosphere.

Sea salt also scatters solar radiation strongly because of its refractive index and can attain a single scattering albedo close to unity4. It can also act as an efficient hygroscopic cloud condensation nucleus that enhances water uptake to form clouds and can affect the net radiative balance via radiative cooling5,6,7,8. Quantifying the contribution of sea salt (SS) and non-sea salt (NSS) aerosol sources to precipitation over land and ocean is important for understanding their complex interacting chemistry9,10,11, the pathways of the dominant aerosol regimes that may affect the pH of rainwater12,13,14, and the subsequent quantification of natural and anthropogenic sources of sea- and non-sea-originating aerosol species.

The summer, or south-west (SW), monsoon brings on average 800–1000 mm of orographic rainfall15 over the Western Ghats, accompanied by warm low-level maritime clouds, and is therefore a prime period for understanding the loading of water-soluble inorganic aerosol species in rainwater and their SS and NSS source apportionment. Seawater fractions have been used over the past decades to estimate the SS and NSS sources of soluble inorganic species dissolved in rainwater16,17. However, the method rests on two major assumptions: first, that no fractionation or loss occurs during the formation, injection and transport of sea salt aerosols of cloud-bearing species; and second, that all reference species are of sea salt origin18. As the monsoon clouds envelop the orographic region of the Western Ghats, pollution from windward cities governs the fate of rainwater aerosol species, and the signatures of sea salt may be lost along the transport path through washout or aqueous-phase reactions. The method may therefore underestimate or overestimate the actual amount of NSS species in rainwater. In the present study, we apply a k-mean clustering algorithm19,20,21 to hourly HYSPLIT (HYbrid Single-Particle Lagrangian Integrated Trajectory) 5-day backward trajectories to separate the discrete air masses that originated over the Arabian Sea during the SW monsoon period and reached the receptor site, Mahabaleshwar (High Altitude Cloud Physics Laboratory, HACPL; 17.92° N, 73.66° E; 1375 m above mean sea level), into two clusters, designated as NSS and SS spells respectively. This cluster classification is applied to 113 daily rainwater samples collected during the 2016 monsoon; a day is assigned to a cluster when more than 60 percent of its trajectories are grouped into that cluster.
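The daily assignment rule can be illustrated with a minimal Python sketch. It assumes that each day comes with a list of per-trajectory cluster labels; the function name, the example labels and the choice to return None for days below the threshold are hypothetical, since the text does not state how such days are handled.

```python
from collections import Counter

def assign_daily_cluster(labels, threshold=0.60):
    """Return the dominant cluster of one day, or None when no cluster
    holds more than `threshold` of that day's trajectories (assumed rule)."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > threshold else None

# Hypothetical day: 17 trajectories in cluster 2 and 7 in cluster 1
day_labels = [2] * 17 + [1] * 7
print(assign_daily_cluster(day_labels))
```

The strict inequality mirrors the "greater than 60 percent" wording; a day split evenly between the two clusters is left unassigned in this sketch.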

Results & Discussions

Surface meteorological conditions over the Arabian Sea

Figure 1(a–e) shows daily time series of meteorological parameters (wind speed and direction, temperature, pressure and relative humidity) observed in situ at latitudinally aligned OMNI buoys over the Arabian Sea. The AD09 buoy is situated in the south, AD07 in the north and AD08 in the intermediate coastal region, as shown in Fig. 1f. During the initial phase of the monsoon, a prominent north-south gradient in surface pressure of ~2 hPa is evident. This shift towards lower pressure is well reflected in the relatively stronger winds observed at the AD07 buoy; the difference in wind speed between AD07 and AD09 is ~1.5 m/s. The stronger winds and cloudy conditions further cool the surface air, consistent with the change observed in near-surface (2 m) temperature (a north-south gradient of ~1 °C) across the Arabian Sea. The shift in the prevailing wind direction during the monsoon from south-west to north-east is well known. However, the north-south shift in surface wind direction is distinctive, and a substantial difference (~30–40°) is observed between AD07 and AD09. This feature is important for understanding long-range transport to coastal sites during rainwater collection.

Figure 1

Daily mean time series of ocean surface meteorological parameters (wind speed, m/s; direction, deg; pressure, hPa; ambient temperature, °C; and RH, %) from the OMNI buoy (AD07, AD08, and AD09) dataset (a–e). The regions shaded in blue, red and white represent cluster-1, cluster-2 and missing dates, respectively. Spatial map (f) of HYSPLIT clustered backward trajectories at the Mahabaleshwar receptor site (MBLR), mapped over ASCAT ocean surface wind stress (Pa), and local AWS clustered wind rose diagram (g) at the HACPL site, Mahabaleshwar, during the 2016 summer monsoon.

Wind speed at the ocean surface is one of the major variables controlling the generation of sea salt aerosols, and wind direction is an equally important factor that controls the advection of these particles. Figure 1f shows the averaged ASCAT satellite wind stress (Pa) over the Arabian Sea at 25 × 25 km spatial resolution during the summer monsoon of 2016. Wind stress signifies the roughness of the ocean surface available for wave generation, which may lead to the initiation of sea salt spray. The maximum wind stress of ~0.3 Pa is seen near the Somali coast. A gradient of wind stress was also observed between the AD07 and AD09 buoys, with values varying from ~0.25 to 0.1 Pa. This indicates a stronger wind forcing on the ocean surface, and hence stronger wave generation, in the northern than in the southern Arabian Sea, which may play a major role in sea spray generation. Mahabaleshwar (MBLR) lies closer to AD07, so the prevailing conditions of sea salt generation at the ocean surface near AD07 should be reflected at MBLR. However, during active and break phases of the monsoon, wind speed and direction fluctuate strongly and the air may be of oceanic or continental origin. Hence, the k-mean cluster algorithm is implemented to separate the wind variability into 2 clusters, as shown in Fig. 1f. Cluster 1 is designated as the one most influenced by continental air masses, with slow winds having northerly components (see also Suppl. Fig. 1), whereas cluster 2 reflects the dominance of oceanic monsoonal south-westerly flows. This is also evident from the wind rose diagram of the local Automated Weather Station (AWS) at MBLR (Fig. 1g). The cluster 1 wind rose reveals that a significant share of the winds reaches the site from the north-west to eastward sector, with magnitudes mostly below ~2–3 m/s, whereas cluster 2 mostly displays strong south-westerly flows reaching ~4–7 m/s. This is further validated by comparing the normalized probability and probability density function (PDF) of both clusters (Suppl. Fig. 1), computed from HYSPLIT backward trajectories at 100 m above ground level alone and from an independent segregation of the local AWS data by the k-mean cluster algorithm. The wind direction in cluster 1 predominantly exhibited north-easterly flows, with some overlap of south-westerly winds, and vice versa for cluster 2 (Suppl. Fig. 1a,b). The wider spread of the wind speed PDF in cluster 2 (Suppl. Fig. 1b,d), between 0 and 8 m/s, indicates higher variability in magnitude than in cluster 1. Nevertheless, the difference in mean wind speed between the two clusters is ~2.5 m/s.

Cluster analysis of aerosol optical depth and wind

The influence of wind on the total Aerosol Optical Depth (AOD) measured over the Arabian Sea is further analyzed using satellite and reanalysis data products. Figure 2(a,d,g,j) shows the average spatial pattern of columnar AOD and the wind field over the Arabian Sea during the south-westerly monsoon period (113 rainwater collection days). The MODIS Deep Blue (DB) and Dark Target (DT) combined product of daily total AOD at 550 nm showed very high AOD values, reaching unity, on cluster 2 dominated days (Fig. 2c,f), whereas in cluster 1 the values were less than half of that (~0.4) (Fig. 2b,e). The increase in cluster 2 is attributed mainly to the wind-blown dust layer aloft14 and to surface wind-generated marine sea salt within the boundary layer. Similar agreement was seen for total AOD at 500 nm derived from the OMI satellite (Fig. 2k,l) and at 550 nm from MERRA2 (Fig. 2h,i). The average spatial pattern of mean AOD in all the satellite and reanalysis products was similar for cluster 1, cluster 2 and the total average. In addition, the standard deviations of the mean total AOD (see Suppl. Fig. 2) were low across all platforms (~0.1 for cluster 1 and ~0.25 for cluster 2). However, despite cloud screening, the level 3 MODIS DB DT product in cluster 2 showed an anomalous standard deviation as high as 1 near the coasts of Oman and Pakistan. This high value is attributed to sporadic strong Shamal winds that merge with the low-level jet at ~850 hPa, uplifting a thick dust layer across the ocean. This indicates that the strength (Suppl. Fig. 3) and direction of the wind substantially govern the columnar aerosol loading, which in turn may also influence the rainwater concentration.

Figure 2

Spatial map of averaged AOD and wind direction over the Arabian Sea for Total (first column), cluster-1 (second column) and cluster-2 (third column). The first (a–c) and second (d–f) rows represent MODIS-Terra and Aqua DB DT combined total AOD at 550 nm, overlaid by ERA-Interim and NCEP reanalysis 10 m wind fields respectively. The third row (g–i) shows MERRA2 total AOD (550 nm) and its 10 m wind fields. The last row (j–l) corresponds to OMI-Aura total AOD at 500 nm, superimposed by ASCAT METOP-A surface wind vectors.

The ocean surface wind speed obtained from ASCAT was also in good agreement with the ERA-Interim, NCEP and MERRA2 reanalysis products (Suppl. Fig. 3). The maximum wind speed reached ~15 m/s in cluster 2 across the Arabian Sea basin (Suppl. Fig. 3c,f,i,l), which is highly conducive to sea spray initiation. In contrast, the winds in cluster 1 (Fig. 2b,e,h,k) weakened drastically (Suppl. Fig. 3b,e,h,k), and a slight shift in direction from south-westerly to north-westerly towards the west coast of the Indian sub-continent was observed. Such a slight change in direction was earlier reported to modulate rainwater concentration and chemistry drastically, because major metropolitan areas lie upwind of the orographic barrier22.

Influence and generation of sea salt aerosol

To understand the contribution of sea salt aerosol to rainwater, we first compute the average sea salt fluxes (Fig. 3a) and sea salt AOD (Fig. 3d) over the Arabian Sea for the 2016 monsoon. The flux is defined as the rate of sea spray generation per unit area of sea surface per increment of particle radius. The sea salt source function parametrization scheme is based on an empirical relationship between wind speed and a shape factor2. The wind speed obtained from ASCAT is well validated against daily in situ buoy measurements (Suppl. Fig. 4). The spatial pattern of ASCAT-derived sea salt fluxes (Fig. 3c) clearly indicates a large production of sea salt droplets in cluster 2 (~8 × 10^5 m−2 μm−1 s−1) that covers most of the Arabian Sea basin. In contrast, cluster 1 (Fig. 3b) shows very low sea salt fluxes (~1 × 10^5 m−2 μm−1 s−1). Moreover, an empirically derived relation for sea salt AOD23, based on a wind index and the calm wind state and varying latitudinally, is shown in Fig. 3e,f, and it captures a similar average spatial distribution. The average sea salt AOD ranges from moderate (~0.05) in cluster 1 (Fig. 3e) to high (~0.2) in cluster 2 (Fig. 3f) in the east of the Arabian Sea. The isolated region of sea salt aerosol generation east of the Somali coast between 5–20° N results from the intense cross-equatorial flow and surface winds during the monsoon, which the empirically derived fluxes and AOD reflect. However, the source function for the amount of sea salt released into the ambient atmosphere does not account for sink/loss and advection terms. Thus, it cannot establish the transport and transformation of sea salt, but it does give vital information about sea salt production in each cluster. It is therefore important to discern the fate of sea salt and the other major aerosols (dust and sulphate) in a framework that includes source, sink and advection tendencies. Accordingly, we explored the NASA-MERRA2 and ECMWF-CAMS reanalysis products to ascertain the influence of different sources on the estimated sea salt concentrations around the observation site.
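For orientation, a wind-speed and shape-factor source function of this kind can be sketched in Python. The coefficients below follow the commonly quoted form of the Gong (2003) parametrization and are an assumption here, as are the function name and the default shape factor of 30; the sketch is not the exact configuration used to produce Fig. 3.

```python
import math

def gong_flux(r80_um, u10, theta=30.0):
    """Sea salt number flux dF/dr (particles m^-2 s^-1 um^-1).

    r80_um : particle radius at 80% relative humidity, in micrometres
    u10    : wind speed at 10 m height, in m/s
    theta  : shape factor (assumed default of 30)
    """
    a = 4.7 * (1.0 + theta * r80_um) ** (-0.017 * r80_um ** -1.44)
    b = (0.433 - math.log10(r80_um)) / 0.433
    return (1.373 * u10 ** 3.41 * r80_um ** (-a)
            * (1.0 + 0.057 * r80_um ** 3.45)
            * 10.0 ** (1.607 * math.exp(-b * b)))

# Hypothetical comparison of a calm and a strong-wind regime at r80 = 1 um
for wind in (5.0, 15.0):
    print(wind, gong_flux(1.0, wind))
```

Because the flux scales with the 3.41 power of the 10 m wind speed in this form, a modest difference in wind speed between clusters translates into a several-fold difference in droplet production.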

Figure 3

Spatial distribution of sea salt mass flux generation due to surface winds (total, a; cluster 1, b; cluster 2, c) by the Gong (2003)2 approach, and sea salt AOD derived empirically from wind following Vinoj & Satheesh (2004)23 (total, d; cluster 1, e; cluster 2, f). The red dot depicts the Mahabaleshwar (MBLR) receptor location.

Transport of dust, sea salt and sulphate

The monsoon is a period in which numerous sources intermingle and concomitantly affect rainwater and aerosol composition. Figure 4(a,d,g,j,m,p) shows the average spatial patterns of dust, sea salt, and sulphate AOD at 550 nm. Among the water-soluble inorganic species in daily rainwater, the highest percentage contributions (~20–30%) came from Na+, Cl− and Ca2+ (Suppl. Fig. 5a). The Ca2+ in rainwater is contributed mostly by mineral soil dust or by sea salt in the form of CaCl2. High percentages of Ca2+ were observed in the rainwater samples of both clusters (1 and 2). The MERRA2 fine mode fraction (PM2.5 AOD/Total AOD) of dust and sea salt indicated ~50% of finer dust (Suppl. Fig. 6a–c) and sea salt (Suppl. Fig. 6d–f) in both clusters. However, in rainwater cluster 2, the Ca2+ to Na+ ratio (Suppl. Fig. 5b) exceeded 1 for most of the period and reached as high as ~5. Such high values of Ca2+ are not expected in rainwater from sea salt alone.

Figure 4

Spatial pattern of dust, sea salt and sulphate aerosol total extinction AOD at 550 nm from MERRA2 (dust a–c; sea salt g–i; sulphate m–o) and CAMS (dust d–f; sea salt j–l; sulphate p–r) reanalysis. The yellow frame outline represents dust, blue outline depicts sea salt and light green outline indicates sulphate AOD for Total (first column), cluster-1 (second column) and cluster-2 (third column).

Saharan dust is an important crustal source of metals and non-metals such as Al3+, Fe3+, PO3−, Pb2+, Mg2+, and Ca2+ found in rainwater14,24. During the monsoon season, cluster 2 (Fig. 4c,f) displayed elevated dust AOD values as high as ~0.4 all across the Arabian Sea. This abundance of dust is a clear signature of strong mid-tropospheric south-westerly flow, which carries wind-blown dust from Middle Eastern countries to the Indian sub-continent, where it blends with moisture-laden maritime clouds that precipitate at the observation site. In cluster 1, by contrast, mineral dust (Fig. 4b,e) was confined to the land mass near Oman, the United Arab Emirates and Saudi Arabia because the winds weakened, hampering its long-range transport. MERRA2 and CAMS agreed in capturing the spatial variability of AOD, although the latter was biased high.

The combination of long-range transported dust in the mid-troposphere and near-surface sea salt contributing to cloud is a distinctive association of aerosols that persists during the monsoon, and this complexity is explored further here. After dust, sea salt aerosols are the most dominant species over the Arabian Sea, more so in cluster 2 (Fig. 4i,l) than in cluster 1 (Fig. 4h,k). MERRA2 sea salt AOD values were ~0.2, whereas CAMS values were as high as >0.4. The biases between the MERRA2 and CAMS reanalyses are inherent features of the models and may be due to different modelling architectures and emission inventories. Our focus here, however, is to investigate whether the two state-of-the-art data assimilation systems reproduce the signals of aerosol properties in our clusters, and they do. The percentage of SO42− in the rainwater samples (Suppl. Fig. 5a) indicates a low (~10%) daily contribution, except for the cluster 1 days during the monsoon withdrawal period, when it rose to ~30%. Sulphate AOD originating over the ocean is as low as <0.1. In contrast, pronounced high sulphate AOD (Fig. 4n,o,q,r) can be seen over land relative to the ocean; this rise is mainly due to anthropogenic activities, mostly industrial emissions, on the windward side of the Western Ghats25 during the season. The elevated sulphate AOD (>0.3) visible in cluster 1 (Fig. 4n,q) north-west of MBLR supports this finding, and its presence in rainwater is affirmed by a slight increase in rainwater SO42− and NO3− concentrations (Fig. 5g). NSS sulphate (of non-sea origin) was also the major contributor to rainwater SO42− and dominated (~80%) in both clusters, indicating that the largest fractional contribution came from land and only ~20% from SS26 (Suppl. Fig. 7), which is consistent with the very low sulphate AOD observed over the Arabian Sea.
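The conventional seawater-ratio split into SS and NSS fractions can be written as a short Python sketch. The equivalent ratios relative to Na+ in the dictionary are commonly quoted bulk seawater values and are assumptions here (Table 2 gives the values used in the study); the function name, the sample concentrations and the capping of the SS part at the measured concentration are likewise hypothetical.

```python
# Assumed bulk seawater equivalent ratios relative to Na+ (illustrative values)
SEAWATER_RATIO_TO_NA = {"SO4": 0.121, "Mg": 0.227, "Ca": 0.044, "K": 0.022}

def split_ss_nss(species, conc, na):
    """Split a rainwater concentration (ueq/L) into sea salt and non-sea salt
    parts, treating Na+ as a purely marine reference species."""
    ss = min(SEAWATER_RATIO_TO_NA[species] * na, conc)  # SS part cannot exceed total
    return ss, conc - ss

# Hypothetical sample: 10 ueq/L sulphate with 16 ueq/L sodium
ss, nss = split_ss_nss("SO4", 10.0, 16.0)
print(f"SS = {ss:.2f} ueq/L, NSS = {nss:.2f} ueq/L, NSS share = {100 * nss / 10.0:.0f}%")
```

The split depends entirely on Na+ being of marine origin and on no fractionation occurring during transport, which are the assumptions examined in the following section.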

Figure 5

Scatter plot of ionic rainwater concentration of Na+ (as reference species) vs cations (a–c) and Na+ vs anions (d–f) in microequivalents per liter for 113 rainwater samples collected during 2016 monsoon rainfall at site HACPL, Mahabaleshwar. The grey shaded region signifies the 95% prediction interval and the red region indicates the standard error of the slope. Panel (g) shows the volume-weighted average concentration with its weighted standard error55, and pH, for all anions and cations in cluster-1 (cyan) and cluster-2 (blue).

In-situ measurements of rainwater species

Seawater ratios have been used extensively, and continue to be an important means of source apportionment, to separate the NSS and SS fractions of aerosol present in rainwater. However, our findings indicate that mixing with polluted continental air masses (from inland and long-range transport) makes the seawater ratios deviate from those obtained by the widely used methods, which assume Na+ and Mg2+ to be purely marine reference species (see Table 1). One pathway is the reaction of unreacted sea salt NaCl of oceanic origin with landmass pollutants such as NSS SO42−, NO3− and organic acids, which replace the Cl− ions and may lead to Cl− depletion in the rainwater sample. Cl− depletion was ~30% and reached as high as ~50% at the peak rainy period in cluster 2 (see Suppl. Fig. 8a) during the monsoon, whereas cluster 1 showed a very low depletion of ~5%. Cl− depletion occurs through two main processes: first, NO3− reacts with NaCl to form NaNO3 and Cl (gas)27; second, NSS SO42− reacts with NaCl to form Na2SO4 and Cl (gas)28. In the ideal case of unreacted NaCl of oceanic origin, the Na+/Cl− ratio should be ~0.85 (see Suppl. Fig. 9a). However, Na+/Cl− ratios for cluster 2 were mostly >1, giving evidence that Cl− had been depleted. Hence, under the assumption that part of the depleted Cl− may be due to NSS SO42− or NO3−, chloride depletion has been computed9 (see Suppl. Fig. 8). Surprisingly (see Suppl. Fig. 9b), the number (~25) of rainwater samples having ratios of Na+/(NO3− + Cl−) in cluster 2 was much higher than that for Na+/(NSS SO42− + Cl−) (see Suppl. Fig. 9), suggesting that most of the Cl− depletion was due to NSS SO42−. This was also supported by correlation analysis. For cluster 2, SO42− and Na+ were very strongly correlated (R~0.95, see Table 1), whereas the correlation between NO3− and Na+ was ~0.7. For cluster 1, by contrast, both SO42− and NO3− were only moderately correlated with Na+ (R~0.5). Seawater ratios should therefore be used cautiously under such depleting regimes. It is also worth noting that Cl− depletion was higher in the fine mode than in the coarse mode, which is mostly related to the larger surface area and longer atmospheric residence time of fine particles9.
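A percentage chloride depletion can be estimated from the Na+/Cl− reference ratio of ~0.85 quoted above, as in the following minimal Python sketch. The formula is one common way of expressing depletion and is assumed here rather than taken from the cited method; the function name and sample values are hypothetical.

```python
def chloride_depletion_percent(na, cl, na_to_cl_seawater=0.85):
    """Percentage of Cl- missing relative to the amount expected from Na+.

    na, cl : rainwater concentrations in ueq/L
    na_to_cl_seawater : Na+/Cl- equivalent ratio of unreacted sea salt
    """
    cl_expected = na / na_to_cl_seawater
    return 100.0 * (cl_expected - cl) / cl_expected

# Hypothetical sample in which Na+/Cl- exceeds 1
print(round(chloride_depletion_percent(na=40.0, cl=30.0), 2))
```

A negative result would indicate Cl− in excess of the sea salt expectation, so in practice the sign of the output needs to be inspected before it is read as a depletion.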

Table 1 Mahabaleshwar rainwater compositional linear regression and Pearson correlational analysis of cluster 1, 2 and total days.

As shown in Fig. 5(a–f), the cluster 1 rainwater species (Ca2+, Mg2+, K+, SO42−, HCO3−) mostly fell outside the 95% prediction interval of the regression for all rainwater samples and are presumably affected by continental air masses, whereas the cluster 2 species mostly fell within the interval (Fig. 5a–f). The Ca2+/Na+ slope for all samples was 25 times higher than the seawater ratio reported in the literature (Table 2), which implies strong dissolution of dust mineral (Ca2+) in the rainwater samples. Similarly, clusters 2 and 1 displayed values 28 and 21 times higher, respectively. Our rainwater regression slopes nowhere come close to the seawater ratios (see Table 2). The clustered volume-weighted average (VWA) showed slightly higher values of NO3− and SO42−, 5.5 and 0.38 μequivalent/liter, in cluster 1 than in cluster 2, which supports an evident anthropogenic influence on rainwater from local inland air masses. This degradation of the sea salt signature in rainwater is due to numerous factors acting concurrently, such as local anthropogenic sources, long-range transport, and the transformation of sea salt through aqueous-phase chemistry with other aerosols, which may lead to a loss of the sea salt fraction along the path to inland precipitation. The VWA values for Ca2+, Mg2+, Na+, and Cl− were the highest among all the rainwater ionic species. Na+, Cl−, Ca2+, and Mg2+ were 15.84, 19.03, 32.09 and 11.23 μequivalent/liter in cluster 1 and about twice as high in cluster 2 (37.16, 38.82, 43.03 and 24.36 μequivalent/liter), except for Ca2+, which was abundant in both clusters. Despite the predominance of Ca2+ in both clusters, the lower pH (Fig. 5g) in cluster-1 (pH 5.3), compared with the less acidic cluster-2 (pH 6), confirmed an increase in anions contributed mainly by NO3− and NSS SO42− (Suppl. Fig. 7) of long-range transported origin29. The presence of dust and marine aerosol near the Western Ghats region throughout the period was also validated with CALIPSO lidar aerosol classification images (Suppl. Fig. 10). Snapshots of CALIPSO images at the initial, middle and final phases of the 2016 monsoon clearly showed that a layer of marine aerosol overlaid by a dust layer in the mid-troposphere persisted throughout the period.
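The volume-weighted average used above weights each sample concentration by the rain amount of that sample. A minimal Python sketch under that standard definition follows; the function name and the sample values are hypothetical, and the weighted standard error is not reproduced.

```python
def volume_weighted_average(concs, rain_mm):
    """VWA = sum(C_i * P_i) / sum(P_i) for concentrations C_i (ueq/L)
    and rain amounts P_i (mm) of the individual samples."""
    if len(concs) != len(rain_mm):
        raise ValueError("concs and rain_mm must have the same length")
    total_rain = sum(rain_mm)
    if total_rain <= 0:
        raise ValueError("total rain amount must be positive")
    return sum(c * p for c, p in zip(concs, rain_mm)) / total_rain

# Hypothetical samples: the heaviest rain day carries the most dilute sample
print(volume_weighted_average([10.0, 20.0, 40.0], [30.0, 10.0, 10.0]))  # 18.0
```

The plain mean of the same three concentrations is about 23.3, which shows how the weighting prevents light, concentrated rain events from dominating the average.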

Table 2 Comparison between slope values of cluster 1, 2, and total rainwater constituents with bulk seawater ratios.

PMF and local source apportionment

To further ascertain the influence of different sources on rainwater composition, Positive Matrix Factorization (PMF; EPA-PMF 5.0)30,31 analysis was carried out. PMF is a multivariate bilinear model that deconvolves a sample data matrix into a factor contribution matrix and a factor profile matrix. The measurement uncertainty was constructed following the approach of Anttila, 199532. PMF analysis was performed for up to 9 factors, and the optimum 4-factor solution (Fig. 6c–f) was chosen on the basis of scaled residuals33,34 and the physical resemblance of the factor profiles to regional sources, as also discussed in Suppl. Fig. 11. As can be seen from Fig. 6c, Factor 1 mainly comprised SO42− and NO3− ions, with 28% and 77% contributions respectively, indicating a secondary inorganic aerosol source. Factor 2 (Fig. 6d) mostly signified a combination of two different sources, wind-induced sea salt and long-range transported mineral dust, linked with characteristic ions such as Na+ (81%), Cl− (84%), Ca2+ (53%), and Mg2+ (50%). Factor 3 (Fig. 6e) mainly consisted of K+ (64%), NH4+ (100%) and Mg2+ (42%) and is attributed mostly to biomass burning and fertilizer sources active during the season. In addition, the diurnal variation of biomass burning organic aerosol (BBOA, derived from PMF analysis of PM1 aerosol) and of non-refractory Cl from the Aerosol Chemical Speciation Monitor exhibited morning (7–10 hr) peaks in mass concentration of ~1.4 and ~0.2 μg m−3 respectively (Fig. 6b). This suggests a significant contribution of biomass burning emissions to the ambient air34. Recent studies34,35 for the same region have also documented the impact of wood burning on aerosol number size and mass variability during the monsoon period. Such freshly emitted particles dominated the 70–100 nm size range recorded by the collocated scanning mobility particle sizer; as can be seen in Fig. 6a, the number concentration reached as high as ~2500 cm−3. The cyan window (Fig. 6b) indicates the retrieval time of the rainwater sample from the automatic precipitation collector, and the disdrometer net rain (mm) in those periods further suggests that these burning episodes were captured in rainwater (through scavenging) and may have affected the rainwater concentration. Moreover, the presence of Mg2+ along with K+ and NH4+ in Factor 3 was linked to agricultural fertilizer used for farming in this region. Finally, Factor 4 is attributed to acid neutralization. The absence of NO3− from Factor 4 (shown and discussed in Suppl. Fig. 12) suggests that Ca2+ (24%) acted as an important neutralizing agent for the SO42− (64%) ion in the atmosphere. The abundance of SO42− over NO3− in rainwater (~2–4 times higher; see Fig. 5g) implies that far more SO42− than NO3− ions are available for the neutralization process. As also seen in Suppl. Fig. 10, dust mostly prevailed regionally. Sulfur dioxide (SO2) emitted from combustion sources (biomass or fossil fuel burning) may in turn have undergone heterogeneous oxidation and subsequently been neutralized by Ca2+ ions present in the cloud droplets. It is also important to mention that the heterogeneous-phase uptake and oxidation pathways of SO2 are much faster36 than those of NOx. A few studies have reported the basic nature of rainwater in this region, with high concentrations of Ca2+37. This availability of Ca2+ in cloud water creates a more conducive environment for the heterogeneous uptake of SO2.
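For reference, the bilinear model underlying PMF is usually written in the following general form, given here as a sketch of the standard formulation rather than of the specific EPA-PMF configuration used in this study. The symbols are introduced for illustration: x is the concentration of species j in sample i, g the factor contribution, f the factor profile, e the residual, σ the measurement uncertainty, and p the number of factors (4 in the chosen solution).

$${x}_{ij}=\sum _{k=1}^{p}{g}_{ik}{f}_{kj}+{e}_{ij},\qquad {g}_{ik}\ge 0,\quad {f}_{kj}\ge 0$$

$$Q=\sum _{i=1}^{n}\sum _{j=1}^{m}{\left(\frac{{e}_{ij}}{{\sigma }_{ij}}\right)}^{2}$$

The factor matrices are obtained by minimizing Q, so species with larger uncertainties weigh less in the fit, and the scaled residuals mentioned above correspond to the ratio of residual to uncertainty.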

Figure 6

Diurnal hourly variation of (a) number size distribution (cm−3), and (b) BBOA (Biomass Burning Organic Aerosol) and non-refractory chloride mass concentration (μg/m3) along with net rain amount (mm) at site HACPL, Mahabaleshwar, during the 2016 summer monsoon rainfall. The cyan shade represents the tentative time period for daily APC rainwater sample retrieval after 24 hrs of collection. The PMF factor profiles of the 4 factors, with their respective rainwater ionic concentrations and percentage contributions to each factor (c–f).

Conclusions

The present study utilizes rainwater composition data from June to October 2016 at Mahabaleshwar, a high-altitude site in the Western Ghats mountain region of peninsular India, along with various available satellite and reanalysis datasets, to assess the impact of different sources on the major water-soluble inorganic constituents of rainwater. A k-mean clustering algorithm was applied to air mass back trajectories to segregate the point observations as well as the satellite and model reanalysis datasets. The cluster analysis reveals that the region is mostly influenced by two types of air mass during the south-west monsoon, one with more continental influence (cluster 1) and another with oceanic influence (cluster 2). The sea salt source function calculated cluster-wise reveals higher sea salt generation in cluster 2 than in cluster 1. Similarly, more dust is found in cluster 2, as evident from the MERRA2 and CAMS reanalysis data. The same picture emerged from the rainwater composition data, with higher volume-weighted average concentrations of Na+, Cl−, and Ca2+ in cluster 2 than in cluster 1. Dust aloft and sea salt beneath formed a sandwich-like structure over the measurement region that provides a continuous source of rainwater constituents, and this exposes the limitations of the seawater ratio signatures, which are far from standard marine values. In contrast, secondary species such as SO42− and NO3− showed the reverse trend, indicating the possible influence of anthropogenic emissions in cluster 1 as the air travelled inland. Thus, segregating the rainwater samples by cluster could not improve the seawater ratios significantly. The study further highlights the limitation of using seawater ratios to calculate sea salt and non-sea salt concentrations, as most of the time they may overestimate the non-sea salt fractions. The present study outlines how different sources add up to the rainwater chemistry and demonstrates an alternative way of identifying and representing sources using k-mean clustering and positive matrix factorization analysis.

Method and Data

Air mass classification by k-mean clustering algorithm

To separate the strongly sea salt dominated air masses into SS and NSS spells, whose signatures should be prominently reflected in rainwater, we use a k-mean clustering algorithm to group the HYSPLIT backward trajectories38. The algorithm minimizes the Euclidean distance from the centroid over numerous iterations until it converges to a final set of assigned clustered trajectories. The Euclidean distance and centroid update are computed as per the following equations.

$$k=2,\quad [{k}_{1},{k}_{2}];\qquad {C}_{1}=[{x}_{1},{y}_{1}],\quad {C}_{2}=[{x}_{2},{y}_{2}]$$
(1)
$${D}_{1}({x}_{o},{y}_{o})=\sqrt{{({x}_{o}-{x}_{1})}^{2}+{({y}_{o}-{y}_{1})}^{2}},\qquad {D}_{2}({x}_{o},{y}_{o})=\sqrt{{({x}_{o}-{x}_{2})}^{2}+{({y}_{o}-{y}_{2})}^{2}}$$
(2)

If D1 < D2, the centroid of cluster 1 is updated as

$${C}_{1}=\left[\frac{{x}_{o}+{x}_{1}}{2},\frac{{y}_{o}+{y}_{1}}{2}\right]$$
(3)

otherwise (D2 < D1), the centroid of cluster 2 is updated as

$${C}_{2}=\left[\frac{{x}_{o}+{x}_{2}}{2},\frac{{y}_{o}+{y}_{2}}{2}\right]$$
(4)

Here, k is the number of assigned clusters, set to 2 on the basis of our best-case well-separated clusters (as shown in Suppl. Fig. 1a,b). C1 and C2 are the randomly selected initial centroids of the two clusters, with position coordinates [x1, y1] and [x2, y2] respectively, and [xo, yo] is the position coordinate of the trajectory under scrutiny. The Euclidean distance of [xo, yo] from each of the two centroids is computed, and the trajectory is assigned to cluster k1 or k2 according to whichever of D1 or D2 is smaller. The same minimum-distance criterion is then used to generate the new centroid, as shown in Eqs. (3) and (4), for the next Euclidean distance calculation. The centroids are re-iterated until the two clusters converge, that is, until the error distance is minimized and no further change in the centroids is observed.
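The assign-and-update loop can be sketched in a few lines of Python. Note that this sketch uses the standard batch (Lloyd) form of k-means, in which each centroid is recomputed as the mean of all points currently assigned to it, rather than the pairwise midpoint update of Eqs. (3) and (4). Each trajectory is reduced to a single 2-D coordinate for illustration, and the function name, parameters and sample points are hypothetical.

```python
import math
import random

def kmeans_2d(points, k=2, init=None, max_iter=100, seed=0):
    """Batch (Lloyd) k-means on 2-D points; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initial centroids: user supplied, or k randomly chosen points
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance, as in Eq. (2)
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:  # converged: no further change in centroids
            break
        centroids = updated
    return labels, centroids

# Hypothetical coordinates forming two well-separated groups
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans_2d(pts, init=[(0, 0), (10, 10)])
print(labels, centroids)
```

Passing `init` makes the run reproducible; with random initial centroids the cluster numbering can swap between runs even when the partition is the same.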

The Zefir tool was used to perform the cluster analysis39. At first, each trajectory is defined to be its own cluster, so that N trajectories give N clusters. In the first iteration, the cluster spatial variance (SPVAR) is calculated for every combination of trajectory pairs. SPVAR is defined as the sum of the squared distances between the endpoints of the cluster’s component trajectories and the mean of the trajectories in that cluster. The total spatial variance (TSV), the sum of all SPVAR, is then calculated. The pair of clusters that is combined is the one that gives the lowest increase in TSV. After the first iteration, the number of clusters is N-1.

$${\rm{TSV}}=\sum ({\rm{all}}\,{\rm{SPVAR}})$$
(5)
$${\rm{SPVAR}}=\sum ({\rm{all}}\,{\rm{trajectories}}\,{\rm{in}}\,{\rm{cluster}})\,[\sum ({\rm{all}}\,{\rm{trajectory}}\,{\rm{endpoints}})\,\{{\rm{D}}\ast {\rm{D}}\}]$$
(6)

Here, D is the distance between a trajectory endpoint and the corresponding cluster-mean endpoint. The iterations continue until the last two clusters are combined, leaving all N trajectories in one cluster. In the first few clustering iterations the TSV increases rapidly; it then increases slowly, at a roughly constant rate, for much of the clustering; and at some point it again increases rapidly, indicating that the clusters being combined are not alike. This sudden change in TSV can be used to identify the optimum number of clusters40: the iterative step just before the large increase in TSV gives the optimum number. For the present study, 2 optimum clusters were estimated (Suppl. Fig. 13). For statistical robustness, these estimates were based on a total of 65088 HYSPLIT backward trajectories (Global Data Assimilation System (GDAS) meteorology at 0.5° × 0.5°; 24 ensemble, hourly) for the 113 rainwater collection days.
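The SPVAR and TSV bookkeeping of Eqs. (5) and (6) can be sketched in Python as follows. This is an illustration only, not the Zefir code: it assumes each trajectory is a list of (x, y) endpoints of equal length and that D is a plain Euclidean distance in those coordinates, and the helper names (`mean_trajectory`, `spvar`, `tsv`, `best_merge`) are hypothetical.

```python
def mean_trajectory(cluster):
    """Cluster-mean trajectory: average the endpoints at each time step.

    `cluster` is a list of trajectories; each trajectory is a list of
    (x, y) endpoints, and all trajectories have the same length.
    """
    n = len(cluster)
    steps = len(cluster[0])
    return [
        (
            sum(traj[t][0] for traj in cluster) / n,
            sum(traj[t][1] for traj in cluster) / n,
        )
        for t in range(steps)
    ]


def spvar(cluster):
    """Eq. (6): sum of squared endpoint distances D*D to the cluster mean."""
    mean = mean_trajectory(cluster)
    total = 0.0
    for traj in cluster:
        for (x, y), (mx, my) in zip(traj, mean):
            total += (x - mx) ** 2 + (y - my) ** 2
    return total


def tsv(clusters):
    """Eq. (5): total spatial variance, the sum of SPVAR over all clusters."""
    return sum(spvar(c) for c in clusters)


def best_merge(clusters):
    """Return the index pair (i, j) whose merge gives the lowest TSV increase."""
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            increase = (spvar(clusters[i] + clusters[j])
                        - spvar(clusters[i]) - spvar(clusters[j]))
            if best is None or increase < best[0]:
                best = (increase, i, j)
    return best[1], best[2]
```

Repeating `best_merge` and recording `tsv` after every merge would give the TSV curve whose sudden rise marks the optimum number of clusters.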

In-situ Meteorological and other data used

The local AWS wind speed and direction at the receptor HACPL site at Mahabaleshwar, recorded at 1 min temporal resolution, were decomposed by vector computation into u (xi) and v (yi) components as per Eq. (7). The k-mean algorithm was then applied to these components over the temporal span, and the result was used to validate the HYSPLIT-separated clusters.

$${x}_{i}={U}_{i}\,\sin ({\Theta }_{i}),\quad {y}_{i}={U}_{i}\,\cos ({\Theta }_{i})$$
(7)

Here, Θi and Ui are the wind direction and speed at the ith time, and xi and yi are the zonal (u) and meridional (v) components of the wind.
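A minimal Python sketch of the decomposition in Eq. (7) is given below. It assumes the wind direction is supplied in degrees and follows Eq. (7) as written; whether an additional sign flip is applied for the meteorological "blowing from" convention is not stated in the text and is not assumed here. The function name `wind_components` is hypothetical.

```python
import math


def wind_components(speed, direction_deg):
    """Eq. (7): x = U sin(theta), y = U cos(theta), with theta in degrees.

    Returns the (zonal, meridional) pair (x, y) in the units of `speed`.
    """
    theta = math.radians(direction_deg)
    return speed * math.sin(theta), speed * math.cos(theta)


if __name__ == "__main__":
    # Made-up 1 min samples of (speed in m/s, direction in degrees)
    samples = [(5.0, 0.0), (10.0, 90.0), (4.0, 225.0)]
    for U, theta in samples:
        x, y = wind_components(U, theta)
        print(f"U={U:5.1f}  theta={theta:6.1f}  x={x:7.3f}  y={y:7.3f}")
```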

Moreover, in-situ surface meteorological data (wind speed, wind direction, relative humidity (RH), pressure and temperature) over the Arabian Sea during the summer monsoon of 2016 were obtained from the OMNI (Ocean Moored buoy Network for north Indian ocean) buoy network and were further used to validate the satellite wind products. A total of 5 buoys are deployed in the Arabian Sea; in the present study we use meteorological data from only 3 of them (AD07: 14.9°N, 69°E; AD08: 11.7°N, 68.6°E; and AD09: 8.2°N, 73.3°E) because of discontinuous temporal coverage during the study period. The wind speed and direction sensor has a resolution of 0.1 m/s and 0.1°, with an accuracy of ±2% and 1.5–4°, respectively. The pressure, RH and temperature sensors have accuracies of ±0.15 hPa, ±1% and ±0.3 °C, respectively. A two-tier data processing procedure is applied to the OMNI buoy datasets as a quality control measure. The checks include value, range, position and time, stuck value, and spike tests. A detailed description of the quality control for these data can be found at https://incois.gov.in/documents/argoQCmanuals/INCOIS-DMG-TR-01-2009.pdf.

Daily 24 h rainwater samples were collected at the HACPL site with an Automated Precipitation Collector (APC). The Eigenbrodt NMO-191/E is an advanced enclosed-system precipitation collector with a circular collection surface area of 500 cm2. Raindrop impaction on the sensor opens the lid, and automatic closure prevents dry deposition during non-rainy periods. The water-soluble inorganic aerosol concentration in rainwater was analyzed by offline ion chromatography (Metrohm IC-850; Supp5&C4). To validate the Ion Chromatography (IC) results, the total anion equivalents were compared with the total cation equivalents in rainwater; to obtain a more accurate balance, HCO3 ions were calculated theoretically and added to the anion equivalents41,42. Subsequently, as quality control of the analytical results, the measured and computed conductivities43,44 (μS cm−1) were compared (see Suppl. Fig. 14).
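The ion balance check described above can be sketched as follows. This is only an illustration under stated assumptions: concentrations are taken to be already expressed in micro-equivalents per litre, the percent difference 100 × (C − A)/(C + A) is one commonly used convention and is assumed here rather than taken from the text, and the function name `ion_balance` and all sample values are made up.

```python
def ion_balance(cations_ueq, anions_ueq):
    """Return (anion/cation ratio, percent difference) from equivalents.

    Inputs are dicts of ion name -> concentration in micro-equivalents
    per litre. The percent difference 100 * (C - A) / (C + A) is an
    assumed convention for this sketch.
    """
    total_c = sum(cations_ueq.values())
    total_a = sum(anions_ueq.values())
    ratio = total_a / total_c
    pct_diff = 100.0 * (total_c - total_a) / (total_c + total_a)
    return ratio, pct_diff


if __name__ == "__main__":
    # Made-up sample, purely illustrative
    cations = {"Na+": 60.0, "Ca2+": 40.0}
    anions = {"Cl-": 70.0, "SO4 2-": 20.0}
    print(ion_balance(cations, anions))   # anion deficit before adjustment
    anions["HCO3-"] = 10.0                # theoretically estimated bicarbonate
    print(ion_balance(cations, anions))   # balance after adjustment
```

In this made-up sample the anion deficit disappears once the estimated bicarbonate is included, which is the role the HCO3 adjustment plays in the balance.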

The diurnal variation of the daily total rainfall amount (mm) was analyzed with an RD-80 impact disdrometer. Its sensor generates an electric pulse when a falling raindrop hits the styrofoam cone diaphragm, which has an area of 50 cm2, and the pulse is then related to raindrop size through the Marshall-Palmer distribution. The disdrometer measures the raindrop size spectrum between 0.3 and 5 mm at 30 s intervals with an accuracy of ±5% of the measured drop size. The BBOA (Biomass Burning Organic Aerosol) and non-refractory chloride mass concentrations (μg m−3) were acquired from a ToF (Time of Flight) ACSM (Aerosol Chemical Speciation Monitor). The ACSM measures the non-refractory aerosol mass concentration of Organics, SO42−, NO3−, NH4+, and Cl− in the 50 nm to 1 μm size range with a 10 min sampling interval; the respective sensitivities are 0.06, 0.006, 0.007, 0.06, and 0.003 μg m−3. Additionally, number concentrations (cm−3) in the size range 5.14 nm to 37 μm from the GRIMM Scanning Mobility Particle Sizer and Environmental Dust Monitor were used in this study. The sample flow rate of the instrument was maintained at 1.2 L/min, held constant to within ±3% by self-regulation, and the reproducibility was within ±3% of the total measuring range.

Calculation of sea salt aerosol

We first estimate the spatial sea salt flux density over the ocean using the parametrization of the sea salt source function from Gong et al.2, as shown in Eqs. (8) and (9). The daily ocean surface (10 m) wind speed during the 2016 monsoon was retrieved from ASCAT (Advanced SCATterometer) on board the EUMETSAT METOP-A satellite at 25 × 25 km grid resolution and was used as input to the source function to compute the sea salt fluxes. ASCAT is a vertically polarized active C-band radar that measures backscatter signals from the ocean surface with a linearly transmitted signal at 5.225 GHz. Coverage is provided by two swaths, to the left and right of nadir, with two sets of three slant receiver antennae at ±45° broadsides. Various internal and external calibration checks, along with quality control and validations45, are performed on the ASCAT-A L2 wind speed to ensure stable results46,47.

$$\frac{d{f}_{o}}{dr}=1.373\,{u}_{10}^{3.41}{r}^{-A}(1+0.057{r}^{3.45})\times {10}^{1.607{e}^{-{B}^{2}}}$$
(8)
$$A=4.7{(1+\Theta r)}^{-0.017{r}^{-1.44}},\,{\rm{and}}\,B=(0.433-\log r)/0.433$$
(9)

Here, dfo/dr (particles m−2s−1μm−1) is the density function of sea salt generation and u10 is the 10-m wind speed. A and B are exponent parameters, with Θ an adjustable shape factor and r the drop radius. In addition, we compute the wind-induced sea salt AOD based on the simple exponential relationship of Vinoj & Satheesh (2004)23.
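Eqs. (8) and (9) translate directly into a short function. The sketch below is illustrative rather than the code used in the study: it assumes that r is given in micrometres and that the logarithm in B is base 10, and the default Θ = 30 is an assumption of this sketch, since the text only describes Θ as adjustable. The function name `gong_flux_density` is hypothetical.

```python
import math


def gong_flux_density(r, u10, theta=30.0):
    """Eqs. (8)-(9): sea salt generation dfo/dr (particles m^-2 s^-1 um^-1).

    r     : drop radius in micrometres (assumed unit)
    u10   : 10-m wind speed in m/s
    theta : adjustable shape factor; the default of 30 is an assumption
            of this sketch, not a value stated in the text
    """
    # Eq. (9): exponent parameters A and B (log taken as base 10, assumed)
    a = 4.7 * (1.0 + theta * r) ** (-0.017 * r ** (-1.44))
    b = (0.433 - math.log10(r)) / 0.433
    # Eq. (8): source function
    return (1.373 * u10 ** 3.41 * r ** (-a)
            * (1.0 + 0.057 * r ** 3.45)
            * 10.0 ** (1.607 * math.exp(-b ** 2)))


if __name__ == "__main__":
    # Made-up radii (um) at a made-up wind speed of 8 m/s
    for r in (0.1, 0.5, 1.0, 5.0):
        print(r, gong_flux_density(r, u10=8.0))
```

Because the wind speed enters only through the factor u10 raised to 3.41, doubling the wind speed multiplies the flux at every radius by about 2^3.41, which is roughly a factor of 10.6.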

Satellite and reanalysis products

To understand the other aerosol compositions that dominate rainwater along with sea salt, we examined aerosol optical depth from reanalysis and satellite products. The Modern-Era Retrospective analysis for Research and Applications version 2 (MERRA-2) is an enhanced version of NASA’s MERRA v1. The upgraded version includes better assimilation of meteorology and of aerosol optical depth from ground-based measurements and spaceborne remote sensing platforms48; the retrospective global data products are available from 1980 onwards and are well validated against ground-based observations49. In the current study, we use total, dust, sulphate and sea salt AOD at 550 nm from the MERRA-2 (M2T1NXAER) and Copernicus Atmosphere Monitoring Service (CAMS) reanalysis daily products. In addition, wind fields (M2T3NVASM) and the fine mode fraction (PM2.5 AOD/Total AOD) of dust and sea salt are also obtained from the MERRA-2 reanalysis.

The European Centre for Medium-Range Weather Forecasts (ECMWF) CAMS reanalysis products are available for the period 2003–2017. The reanalysis of atmospheric composition uses the 4Dvar data assimilation technique and is assessed and validated on a quarterly basis for stable output50,51. Along with it, we use surface winds from the ERA-interim and NCEP (National Centers for Environmental Prediction) reanalyses. The biases in the wind fields are proportional to altitude, which suggests a low bias near the surface52,53,54.

In addition, aerosol properties were obtained from several spaceborne polar-orbiting satellite sensors: the Ozone Monitoring Instrument (OMI-Aura), the Moderate Resolution Imaging Spectroradiometer (MODIS-Terra, Aqua), and the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) onboard the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) satellite. Apart from Terra, all of these platforms belong to the A-train satellite constellation. MOD08_D3 is a level 3, quality-assured and cloud-filtered combined dark target and deep blue AOD product at 550 nm for land and ocean. The MODIS-Aqua product MYD08_D3 is similar to MOD08_D3, except that Aqua follows the same orbital track as OMI-Aura and CALIPSO, with an equatorial crossing at 1:30 local time. We have also included OMAERUVd_003, an OMI cloud-screened level 3 AOD product at 500 nm. Moreover, to differentiate between aerosol species (dust, marine, and smoke), backscatter profiles and the corresponding aerosol feature classification images were obtained from CALIOP.