Globally, poor air quality is estimated to cause about 8.8 million premature deaths per year1. It affects the cardiovascular and respiratory systems, and is also known to cause cancers and affect cognition2,3. It is the leading global environmental risk for human health. To tackle this problem and reduce the negative effect of pollution emissions, successful air quality management and control is needed. This requires not only measurements of air pollution levels but also information on the sources of air pollution and their relative importance. Without this critical, targeted information on pollution sources, it is difficult to plan and enact control measures to reduce air pollution. For many years, source apportionment was conducted using research-grade instruments. Many studies have been conducted to develop methodologies for source identification and apportionment which can help in assessing the sources of pollution at a given site. Among the methodologies proposed, the most commonly used ones are Positive Matrix Factorisation (PMF)4,5 and k-means clustering6,7. However, the cost and logistics associated with the required measurements make their use scarce and time consuming, and typically limit their usage to the academic literature.

Over the last two decades there has been a revolution in air quality measurement using low-cost sensors (LCS). For particulate matter (PM) measurements, these sensors typically use miniaturised versions of the tried and tested methodologies used in regulatory-grade instruments. These sensors are becoming more dependable and have proven their capabilities in air quality monitoring8,9,10,11 and in several applications for which monitoring costs were previously a limiting factor12,13. As a trade-off for their lower cost, LCS do not offer the same level of accuracy and need sophisticated statistical analyses and calibration to provide reliable results9,14,15,16,17.

To date, there have been huge numbers of studies that use LCS for the measurement of air quality but very few studies of their use in source apportionment. Until this point, LCS have been tested for source apportionment in background environments with either less complicated sources from greater distances (which made it easier to distinguish different pollution sources) or in cases where the major sources of pollution were limited in number18,19. In some cases LCS data were used with simpler methodologies mainly for pollution source identification8,20. Previously, we have shown the use of two statistical techniques (k-means clustering and PMF) that use the PM size distribution data measured by LCS, for source identification and apportionment. While there are great differences in the approach between the two methods, their outcomes complement each other in providing a clearer picture of the sources of pollution and the conditions that affect the extent of their impact18,21.

This study shows that LCS have great potential to identify local sources of air pollution in complex urban environments. We use k-means clustering and PMF analyses on the PM size distributions measured by low-cost optical particle counters in three distinct locations. Two locations are industrial in character: a construction site and a quarry. The third site is a roadside location. All locations are in central England, UK, and contain activities that require boundary line monitoring. We show that individual sources of PM can be identified and their contribution to overall PM concentrations can be apportioned. This study provides a low-cost methodology for PM source apportionment that will facilitate the pinpointing of PM sources thereby providing the required information for industries and regulatory bodies to reduce PM concentrations and achieve national and international air quality standards.


Results from Curzon Street, Birmingham, UK

The use of the two statistical methods on LCS data provided sufficient insight into both the sources that affect the air quality at the Curzon Street site and the effect of each source. The results of the two methods are found in Supplementary Figs. 5 and 11 (factors from the PMF analysis are marked as F followed by the factor number, while clusters from the k-means clustering analysis are marked as CL followed by the cluster number). Apart from the Particle Number Size Distribution (PNSD) profiles of the PMF factors, the temporal and wind variation of the G contribution of each factor (the relative contribution of each factor to the local atmosphere) is presented. Additionally, Table 1 shows the average meteorological conditions and PM concentrations of the clusters from the k-means analysis.
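The G contribution is described in the figure captions as a relative contribution whose average equals 1. As a minimal sketch of that normalisation and of a diurnal profile built from it, assuming hourly factor contributions are available as plain lists (the function names and the sample values are hypothetical and not taken from the study):

```python
from collections import defaultdict
from statistics import mean


def normalise_g(contributions):
    """Scale a factor's time series so that its average equals 1."""
    avg = mean(contributions)
    if avg == 0:
        raise ValueError("factor has zero mean contribution")
    return [c / avg for c in contributions]


def diurnal_profile(hours, g_values):
    """Mean normalised contribution for each hour of the day present in the data."""
    by_hour = defaultdict(list)
    for hour, g in zip(hours, g_values):
        by_hour[hour].append(g)
    return {hour: mean(vals) for hour, vals in sorted(by_hour.items())}


# Hypothetical series: two days sampled at 03:00 and 10:00 (illustrative values only).
hours = [3, 10, 3, 10]
raw = [1.0, 3.0, 1.0, 3.0]
g = normalise_g(raw)
profile = diurnal_profile(hours, g)
print(profile)
```

A value above 1 at a given hour then indicates that the factor contributes more than its own average at that time of day.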

Table 1 Conditions and frequency of the clusters from the k-means analysis for all sites.

The effect of the activities at the construction site was found by both methods (CL1, CL3, CL4, CL5 and F2, F4). Specifically, the clustering method separated the effect of the construction site, which was more pronounced for particles of about 10 μm in diameter (Fig. 1). Similarly, using PMF, two hotspots of particle emissions were found, peaking at about 5 μm and having increased concentrations for larger particles (F4), as they presented greater contributions to the PM10 concentrations than to the PM2.5 (Table 2). This is consistent with findings from previous studies on construction sites22,23. This source also presented a peak at the smallest available particle size, which in most cases points to increased particle concentrations at sizes below the lower measurement limit of the sensor. Looking at the temporal trend of this source, it is directly associated with the earth-moving works performed at the construction site (due to the PNSD profile, this can be associated with emissions from vehicles working at the construction site), though a small contribution of this source is also visible outside the period when such works were undertaken (according to the works diary provided by the constructor, earth-moving works stopped on the 21st of September), which may be associated with resuspension from the same site. This factor presented the greatest contribution to PM10 in the area, reaching up to an estimated 60% during working hours (5:00–15:00 on weekdays) (Fig. 2). The effect on smaller PM sizes is significantly smaller (about 5% on average, though more than double that during working hours), pointing to its association with the larger particles produced by the earth-moving activities taking place at the construction site. This effect is greatly reduced outside working hours.

Fig. 1: Variation of 11 μm diameter particles of the clusters in Curzon Street.

Polar annulus of particles with diameter 11 μm for the clusters from the k-means clustering analysis (Curzon Street). Particles with diameters between 7 and 20 μm are found to be directly associated with the emissions from the works at the construction site. The effect from the works has been found in almost all clusters formed by the k-means analysis showing the significant effect of the specific source regardless of the atmospheric conditions. a CL1, b CL2, c CL3, d CL4, e CL5.

Table 2 Estimation of the PM1, PM2.5 and PM10 contribution (%) of each PMF factor.
Fig. 2: Characteristics of the factor associated with the industrial site in Curzon Street.

Map of the Curzon Street construction site and the location of the measuring station (image by ©Google) (a). Diurnal variation of the G contribution (the relative contribution of the factor compared to the average contribution of the factor which is equal to 1) of the factor mainly associated with the works on the construction site (c). The polar plots on the right show the estimated PM10 concentrations attributed to the factor on non-working (b) and working hours (d). Working hours are considered between 5:00 and 15:00.

Another source of large particles associated with activities within the area and period of the earth-moving activities was also pinpointed (F2), with measured peaks between 1 and 2 μm. This presented a wider area as a source location, though with a temporal variation similar to that of F4, and is probably associated with other activities at the construction site. Nevertheless, the possible effect of other external sources occurring at the same time cannot be excluded. The effect of this factor on the PM load at the site was almost negligible for all PM sizes, though its contribution doubled during working hours. A similar PNSD profile was found by Belkacem et al.24 to be associated with non-exhaust vehicle emissions (i.e. resuspension or tyre and brake wear), which may be the case in the present study as well.

While no significant differences were found in the PNSD using the k-means (Supplementary Fig. 9), the PMF analysis showed that the main source of particles <1 μm is to the southeast (Supplementary Fig. 8 and Table 2), where Birmingham city centre is located. The effect of this source is greater during night-time and early morning hours. This may be associated with the daily variation of the boundary layer height (BLH), which is reduced during these hours and tends to increase the effect of any local pollution source due to reduced mixing25,26,27. This source was pinpointed by both methods (CL2 and F1), and it was associated with the most polluted conditions, with respect to particle mass concentrations, for PM1.

Finally, the PMF also identified two more constant sources of particles (F3 and F5), which did not present a clear spatio-temporal variation. These are probably associated with background emissions related to activities from the urban environment or the greater area, which are not associated with the construction site according to their other characteristics. Among these two, F5 appeared to have a more significant contribution to larger PM (both PM2.5 and PM10), which may associate it with sources of particles of marine origin. These two factors did not present any variation during working hours, further suggesting their background character.

The use of the clustering provides insight into the real-world conditions occurring at the site. Thus, while the separation of the sources that affect the air quality at the site is not as distinct as with the PMF, this method helps in better understanding the combination of the sources as well as their effect under different conditions. The small wind variation for the period studied resulted in rather homogenised clusters, with wind originating from the direction of the construction sites. All clusters carry the effect of the source to the southeast (the city centre and probably the railroad also located on that side), which affects the site throughout the day for the whole measuring period, though its effect is enhanced during the night hours, as mentioned earlier. While not clearly visible when considering PM measurements, the activities of the construction site are more visible when different particle size ranges are considered. For particles of 5 μm and above, two hotspots are observed that are similar to those in the PMF analysis (Fig. 2). The clustering method managed to further quantify the effect of these two hotspots (hence the large number of clusters associated with the works on the construction site), showing that the one to the northeast contributed a greater number of larger particles, not only having a larger PM10 concentration, but a smaller PM1 to PM10 ratio as well (Table 1). The latter, though, may be the effect of the sources of smaller particles found on the south side. This cannot be clarified without measurements of particles at smaller sizes than those the optical particle counter (OPC) provides, or without chemical composition data, which were not available for this study.

Results from the Mountsorrel quarry in Leicestershire, UK

At the quarry in Leicestershire, UK, both methods (Fig. 3 and Supplementary Figs. 12–15 and 17–19) identified the background particle composition profiles (the particle composition profiles occurring when no significant local source affects the measuring site) and their variation with the conditions under which they occur (CL1, CL3, CL4 and F1). Additionally, they managed to identify three important sources of particles, one being associated directly with the works in the quarry (F2). This was separated into two parts, one being the crushing area of the quarry, located about 500 m southwest of the measuring site. Due to the great distance from the measuring site, its effect is important with stronger winds from that direction (CL6). Such conditions bring particles predominantly in the size range of 1.5 to 5 μm to the site and can increase the concentration of PM2.5 and PM10 up to three times compared to the average conditions when downwind (though this only occurred for 3% of the measurement period) (Table 1). This effect is confirmed by the PMF as well. Using the estimated PM concentrations for the factor F2, which is the factor directly associated with the work at the quarry, F2 is responsible for about 17% and 10% of the PM10 and PM2.5 concentrations at the site respectively (Table 2). As with the construction works at the Curzon Street site, work at the quarry had a greater effect on the PM10 concentrations than on PM2.5. These values are almost doubled with southwesterly winds (32% and 18% respectively), making the quarry an important source of such PM (the first factor though remains the greatest contributor of PM at the site), especially on working days (Fig. 3).
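The comparison between a factor's overall share of PM10 and its share under southwesterly winds can be illustrated with a short sketch. It assumes hourly estimated PM concentrations for one factor, hourly total PM and hourly wind directions are available as lists. The ratio-of-sums definition of the share, the sector limits and all values below are assumptions made for illustration, not the study's stated procedure or data.

```python
def factor_share(factor_pm, total_pm):
    """Percentage of the total PM attributed to one factor over the given hours."""
    total = sum(total_pm)
    if total == 0:
        raise ValueError("total PM is zero for the selected hours")
    return 100.0 * sum(factor_pm) / total


def in_sector(direction_deg, start_deg, end_deg):
    """True if a wind direction lies in [start, end), handling the wrap at 360 degrees."""
    d = direction_deg % 360
    if start_deg <= end_deg:
        return start_deg <= d < end_deg
    return d >= start_deg or d < end_deg


def sector_share(factor_pm, total_pm, wind_dir, start_deg, end_deg):
    """Factor share restricted to hours with wind from the given sector."""
    rows = [(f, t) for f, t, w in zip(factor_pm, total_pm, wind_dir)
            if in_sector(w, start_deg, end_deg)]
    return factor_share([f for f, _ in rows], [t for _, t in rows])


# Hypothetical hourly values (illustrative only).
factor_pm10 = [2.0, 6.0, 1.0, 1.0]
total_pm10 = [10.0, 10.0, 10.0, 10.0]
wind_dir = [225, 230, 90, 0]

overall = factor_share(factor_pm10, total_pm10)
southwest = sector_share(factor_pm10, total_pm10, wind_dir, 202.5, 247.5)
print(overall, southwest)
```

In this toy series the share rises from 25% overall to 40% for the southwesterly sector, which is the kind of contrast reported for F2.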

Fig. 3: Characteristics of the factor associated with the works at the Mountsorrel quarry.

Map of the Mountsorrel quarry and the location of the measuring station (image by ©Google) (a). Diurnal variation of the G contribution (the relative contribution of the factor compared to the average contribution of the factor which is equal to 1) of the factor mainly associated with the quarry and the additional works area (c). The polar plots on the right show the estimated PM10 concentrations attributed to the factor on non-working (b) and working hours (d). Please note the difference in the scale. Working hours are considered between 6:00 and 17:00 on weekdays and 6:00 to 12:00 on Saturdays (according to the schedule provided by the quarry operator).

The second important source is at the area directly to the south of the measurement site located at a smaller distance compared to the crushing area, which was grouped together with the crushing part of the quarry. At this location other works take place which seem to contribute to a significant increase in the concentration of particles of smaller size, between 0.5 and 1.5 μm.

Additionally, the PMF managed to specify a source associated with marine origin at the site (F3), responsible for a significant portion of the PM10 and PM2.5 at the site. This factor presented the same PNSD profile as the marine factors in the other areas of study, with two peaks at sizes <1 and 2 μm. While this factor was separated from the one associated with the quarry, its effect is more significant with southwesterly winds (Supplementary Fig. 16). The PM concentrations of this factor are probably enhanced by the emissions from the quarry as the incoming air masses of marine origin pass over the quarry area. Still, its contribution to the PM is smaller both in general and specifically with SW winds (about 14% and 19% respectively) compared to the factor that was directly assigned to the quarry.

Finally, an undefined source of particles was also found, probably located very close to the measuring site (as it is associated with rather low wind speeds), probably to the north, and active mainly during the night-time. While with such low wind speeds it is hard to infer the location of a source with good precision, there was a persistent connection of this source with wind directions from the northern sector (CL5 and F4). The PMF analysis separated this source from the emissions from the nearby town to the north (though this does not exclude a possible connection to it, as wind speeds are lower during the night), and the estimated effect of this factor on the PM concentrations at the site was found to be rather low.

From the results of the analyses, the quarry is a notable but not the most important source of particles at the site during its working hours (according to the clustering analysis, its effect exceeded the updated daily limit set by the World Health Organisation3 for only 12% of the monitoring period for PM2.5 and 3% for PM10), while the highest PM concentrations were observed in almost calm conditions and are probably associated with other sources (mainly the nearby town to the north). Its effect is separated into two parts at the site, probably due to the different works that take place in these areas, which have a different effect on the PNSD. Specifically for the southwestern area (the crushing area), its effect is more pronounced when the measuring site is downwind and with higher than average wind speeds, while its effect is a lot smaller with lower wind speeds, still increasing the PM concentrations but with average values considerably smaller than double those of the average conditions. Using the PMF, the factor associated with the works at the quarry presented similar results, exceeding the limits set by the WHO on only 5% of the hours of the measuring period, spread over several days (using the estimated PM concentrations), though this effect is slightly increased when the effect of the marine factor, which appears to be affected by emissions from the quarry, is considered. The consistency of the results between the two methods is highly encouraging as it shows that both methods can provide meaningful and self-consistent results for air quality studies.
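The exceedance percentages quoted above amount to counting the share of periods in which a concentration is above a limit. The sketch below shows that calculation under two assumptions that the text does not spell out. First, the limits are taken to be the WHO 2021 24-hour guideline levels (15 µg/m³ for PM2.5 and 45 µg/m³ for PM10). Second, each list entry is taken to be one comparison period. The sample concentrations are hypothetical.

```python
# Assumed limits: WHO 2021 24-hour air quality guideline levels, in µg/m3.
WHO_2021_24H = {"PM2.5": 15.0, "PM10": 45.0}


def exceedance_percent(concentrations, limit):
    """Percentage of periods whose concentration is strictly above the limit."""
    if not concentrations:
        raise ValueError("no concentrations supplied")
    n_over = sum(1 for c in concentrations if c > limit)
    return 100.0 * n_over / len(concentrations)


# Hypothetical PM2.5 concentrations attributed to one source (illustrative only).
pm25 = [5.0, 12.0, 16.0, 30.0, 8.0, 9.0, 10.0, 14.0]
share_over = exceedance_percent(pm25, WHO_2021_24H["PM2.5"])
print(share_over)
```

The same function can be applied per cluster or per PMF factor, depending on which estimated concentration series is passed in.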

Results from the roadside site in Charlbury, UK

This case study investigated a provincial roadside site in Charlbury, UK, a location predominantly impacted by traffic emissions rather than industrial emissions. Once again, both the k-means and PMF methods were applied to the dataset and their results are summarised in Fig. 4 and Supplementary Figs. 20–26. It should be noted that, to the authors’ knowledge, roadside measurements using LCS have never been analysed in this manner before, as particles derived from traffic are mostly smaller in size than those measured by these sensors.

Fig. 4: Wind profiles of the clusters at the Charlbury roadside site.

Windroses of the five clusters identified via the k-means clustering analysis generated for the Charlbury roadside location. a CL1, b CL2, c CL3, d CL4, e CL5.

In this case, the clustering method managed to identify 5 unique conditions (Table 1) that lead to different particle profiles, separating night and day (CL1 and CL3), polluted (CL1, CL2 and CL4) and clean conditions (CL3) as well as separating a period when strong wind was blowing parallel to the road (CL5). The most polluted conditions were found when wind was blowing from the south (CL2), a side which coincides with the main area of Charlbury, without this excluding the effect of the traffic from the nearby road (junction). Contrary to this, particle concentrations were rather low when the wind was blowing parallel to the axis of the road from the north regardless of this occurring on a normal weekday (CL5). CL3 and CL4 were the most common clusters (about 70% of time combined), occurring almost every day (Supplementary Table 1) with higher frequency during midday and the rush hours respectively. CL3 is not greatly affected by the nearby traffic having low PM concentrations and PM1 to PM10 ratio, while CL4 is, as expected, greatly affected by traffic having the opposite characteristics (a high PM1 to PM10 ratio indicates a greater relative content of fine particles which, especially for such an environment, are probably associated with traffic).

The PMF separated three distinct particle profiles at the site. The main source that causes the greatest increase of particles at the site (F1), has increased contributions as particle size decreases. This factor is probably associated with the traffic from the nearby road as increased contributions of this factor were found with easterly wind directions, though the effect of the part of the town at that side cannot be excluded. This factor presents a rather balanced diurnal variation and is associated with the majority of the PM load for all size ranges, as found by the estimated PM concentrations (Table 2).

Another factor with similar PNSD profile but significantly lower contribution to the PM concentrations at the site was also found (F2), though this one presents its highest contributions with southerly winds. As mentioned earlier, while the town centre of Charlbury is located at that side, the possible effect of the nearby junction should not be overlooked. This factor presents a more distinct diurnal variation with greater contributions during the night and early morning hours, which may associate the variation of this factor with that of the BLH. While it has a similar PNSD profile with F1, using the estimated PM concentrations, this factor has a significantly greater effect on the PM2.5 and PM1 concentrations rather than the PM10.

Finally, one more source of larger particles was found (F3), which seems to have a more regional character and presented a more balanced contribution profile with wind directions from almost all sides. This factor presents a two-peak PNSD profile, with the peaks being at the same particle sizes as the ones assigned to the marine factor in the other study areas. This factor does not have any distinct wind direction or speed hotspots, though it is the only factor with its estimated PM concentrations increasing with increased wind speeds. This factor is associated with more than 20% of the estimated PM10 concentrations at the site, though its contribution reduces for smaller PM sizes. The very high PM10 contribution found for this factor makes it unlikely for it to be solely attributed to marine sources. Solutions with a higher number of factors were attempted, though they had a very small effect on the variation of this factor.

At the roadside site, the PMF, when used on PNSD and PM data, was able to identify and separate rather distinct particle profiles, though these could not be confidently attributed to specific sources. This is probably due to the short measuring period, as larger datasets tend to smooth and clarify the results of such analyses. The clustering method, on the other hand, provided a clearer picture of the conditions at the site according to the combination of particle sources, meteorological conditions, and time of day (which is associated with different anticipated traffic densities). The uncertainty of the results from the PMF analysis at this site shows the need for either additional information (such as the atmospheric chemical composition) or longer measurement periods for confident association of factors with sources. This will likely be the case for any analysis which includes sources that are associated with particles below the measuring limit of the LCS.


The low-cost methodology developed in this paper was able to identify the main pollution sources affecting air quality at the three case study sites. Low-cost PM size distribution data, provided by the OPC sensor, give sufficient information for the k-means clustering and PMF analyses to identify and apportion various pollution sources. Furthermore, it was possible to estimate the contributions of the identified sources to overall PM mass concentrations in the regulatory PM2.5 and PM10 size ranges.

The three case studies provided complex urban environments in which to assess and apportion air pollution sources. The approach managed to effectively separate the effect of sources in the surrounding area, find the periods and conditions that affect their variability, and quantify the effect of the pollution sources depending on real conditions (weekday, meteorology etc.). The k-means clustering provides a clear image of the air quality at each site according to changes in the general air quality patterns, making it a reliable method for identifying the effect of near-constant sources, even with limited datasets. The PMF approach can identify and apportion distinct sources and estimate their contribution to total PM2.5 and PM10 mass concentrations.

The low-cost approach was particularly good at identifying and apportioning air pollution sources that have super-micron-sized PM contributions. The lack of information from the OPCs at particle diameters <0.38 μm makes identification of most regional sources more difficult, because the PNSDs of regional sources typically have characteristic peaks at sizes smaller than the OPC size cut-off. Hence, most regional sources were homogenised into a single source with little information about their origin. These sources appeared as a single blurred factor or cluster, with the PNSD profile peaking at the smallest particle size bin available. The low-cost approach will also likely miss local combustion sources, which are typically characterised by small particles. This omission will be important at roadside sites and other locations under heavy influence of internal combustion engines, as pointed out by our study.

In all the areas studied, the PMF approach was able to identify one distinct regional source that was attributed to a marine factor. This source had a PNSD profile with super-micron features and was observed across all three case study sites as well as in a previous study investigating an urban background site, described in Bousiotis et al.14 (Fig. 5). This ability to identify distinct sources by PNSD profile is highly encouraging because it reduces the number of factors that need to be distinguished using the PMF analysis, and indicates that similar sources show similar profiles regardless of the local environment. This can be very useful in simplifying the source apportionment process, making such applications easier in the future and their wider use feasible without the need for expert personnel.

Fig. 5: The marine factor as identified at the studied sites.

The marine factor was identified at the sites of the study and BAQS. A factor associated with marine sources has been identified at all the sites in this study as well as in a similar study done for the BAQS17. At all sites, it presents a bimodal PNSD profile with peaks at about 750 nm and 1.5 μm. Regardless of the difference in the particle counts (the BAQS site result is multiplied by 10), which is affected by the local conditions, uniformity of the result is expected as all the sites are greatly affected by air masses originating from the Atlantic Ocean, which is the most common origin for incoming air masses in the U.K.39.

Currently, PM regulations are focused on PM2.5 and PM10. While the methods presented here may not be able to distinguish all the PM sources that affect air quality at a particular site, they can separate and approximately quantify the predominant ones, which in most cases contribute the most in the deterioration of the local air quality. The approach cannot assess total PM number concentrations, a metric that is becoming increasingly important from a regulatory perspective, which is dominated by particles of smaller diameter than that measured by the OPC. Going forward, low-cost particle number counters used in combination with the low-cost OPCs might offer good potential in this direction.

While the previous works18,21 focused on characterising the capabilities of the methodologies for LCS source identification and apportionment, the present study shows that the methodology can be useful for industrial applications. Additionally, compared to the previous work, this study highlights the ability of PMF to provide quantitative results on the effect of different industrial activities on the air quality in the surrounding area. This is crucial for the assessment of the environmental footprint of such activities, which was limited until now due to the high cost of such analyses. Additionally, the k-means clustering showed the effect of meteorological and temporal variations on air quality levels, which can be useful for applications such as urban planning or environmental alerts.

There is great potential for low-cost source identification and apportionment. By achieving local source apportionment in a lower cost manner, this study makes clear that PM source apportionment could be used more widely for regulatory and compliance purposes. By pinpointing individual pollution sources, the new methodology provides significant prospects to reduce overall PM concentrations through the targeting of specific sources, for example, nuisance dust from construction. Primarily, the technique should allow for greater use of source apportionment by significantly reducing the associated economic and logistical costs. Already, we believe that low-cost source apportionment can supplement regulatory grade source apportionment. Further field testing is now needed to see if low-cost source apportionment can replace regulatory grade source apportionment and in what situations this is appropriate. The low-cost technique should be particularly beneficial in boundary line source apportionment of point source polluters. The use of sensor arrays across urban areas in combination with the techniques described in this paper opens up the prospect of source triangulation to pinpoint precisely the location of emitters.


Aerosol counts and sizing

For the PM size distribution measurements at all sites, Alphasense OPC-N3 sensors were used, with each sensor costing approximately GBP 250. The sensor is a small Optical Particle Counter (OPC), with dimensions of 75 × 63.5 × 60 mm and a weight under 105 g. The sensors measure particle number concentration in the size range of 0.35–40 µm in 24 bins, with a maximum particle count of 10,000 per second. Particles are counted and classified by measuring the light scattered by individual particles as they pass through a laser beam within the sensor. The sensor has a minimum sampling time resolution of <1 s, though in the present study measurements were averaged into hourly periods. At the industrial sites, regulatory instruments located <2 m from the sensor were used to monitor the dust emissions and air quality, operated either by the respective operator or by DustScanAQ. In both cases the OPC was directly calibrated against the measurements from the regulatory instrument.

Measuring sites

Three distinct sites were used in this study, all of which are located within central England, UK (see Supplementary Fig. 1). The site in Curzon Street, Birmingham (52.48°N; 1.89°W), is located close to the construction works of a major new railway station and north of the Birmingham city centre. The construction site lies in a broad arc that extends from the south to the northeast of the monitoring station. Particle number size distribution data were collected for the period between 2/9/2020 and 26/10/2020.

The second site is the Mountsorrel quarry in Leicestershire (52.73°N; 1.16°W), located south of the nearby town. It is one of the largest quarries in Europe and comprises the crushing site and several other facilities. Data were available for a 4-week period between the 1st and 28th of July 2021 at a location northeast of the crushing area of the quarry and about 100 m north of the other work areas within the quarry site.

The third site in this study is a roadside location on Market Street (51.87°N; 1.48°W), a provincial road in Charlbury, UK. Though Charlbury is a rather small town (population of about 3000), the road is a relatively busy one traversing the centre of the town. The measuring period is the week between 7/10/2021 and 14/10/2021.

All measurements were made at ~2 m above ground level. At all sites, particle number concentrations were measured at a 1 min resolution and then averaged to hourly means. Time periods were chosen such that no missing or negative values were present in the datasets.
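The averaging and screening steps can be illustrated with a minimal sketch. The function names, the (timestamp, value) record layout and the use of None for a missing reading are assumptions made for illustration and do not describe the processing code used in this study.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def is_valid_period(records):
    """Return True if no reading is missing (None) or negative.

    `records` is a list of (timestamp, value) pairs for one size bin
    (hypothetical layout, assumed for this sketch).
    """
    return all(value is not None and value >= 0 for _, value in records)


def hourly_means(records):
    """Average 1 min readings into hourly means, keyed by the start of the hour.

    Assumes the period has already passed `is_valid_period`.
    """
    by_hour = defaultdict(list)
    for timestamp, value in records:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        by_hour[hour].append(value)
    return {hour: fmean(values) for hour, values in sorted(by_hour.items())}
```

A candidate period would first be checked with `is_valid_period` and only then passed to `hourly_means`, once per size bin.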

Meteorological data

For Curzon Street, data from the meteorological station at the University of Birmingham were used (the average wind rose for the measurement period is found in Supplementary Fig. 2). While the University of Birmingham meteorological station is located about 6–7 km away from the measuring site, the data provided are not biased by the local urban topography at Curzon Street, thus providing a better representation of the regional conditions in the greater area.

For the Mountsorrel quarry, meteorological data were available from a station located a couple of hundred metres to the west of the measuring station (the average wind rose for the period is found in Supplementary Fig. 3). Wind speed and direction measurements were made at 5 m height for this site.

Meteorological data were not available at the measurement site at Market Street in Charlbury, thus data from the nearest meteorological station in Little Rissington (14 km west of the site) were used (the average wind rose for the period is found in Supplementary Fig. 4). No data that may identify individuals were collected for this study.

k-means clustering

k-means clustering is a widely used source identification method that has been applied in many previous studies to data from either regulatory-grade instruments6,28,29 or LCS21. It performs better than other clustering methods, as it produces clusters with greater consistency between their elements and better separation from the elements of other clusters30,31. k-means is a variable reduction method which partitions the observations into k sets, forming groups with the minimum possible variance (squared Euclidean distance) among the elements of each cluster32. The choice of the optimal number of clusters was informed by testing the different solutions with two metrics, the Dunn Index33 and the Silhouette width34 (Supplementary Fig. 27), as proposed by Beddows et al.30. Where the choice of the best solution was not clear, the candidate solution that better described the conditions at each site was chosen.
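As an illustration of the partitioning step and of one of the two selection metrics, the sketch below implements the standard Lloyd iteration for k-means and the mean Silhouette width in plain Python. It is not the implementation used in this study: the function names are hypothetical, the Dunn Index is omitted, and the deterministic farthest-point initialisation is an assumption chosen for reproducibility (random restarts are the more common choice).

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start (assumption): first point, then the point farthest from all chosen centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def silhouette_width(points, labels):
    """Mean silhouette width s = (b - a) / max(a, b) over all observations."""
    scores = []
    for i, p in enumerate(points):
        sums, counts = {}, {}
        for j, q in enumerate(points):
            if j != i:
                sums[labels[j]] = sums.get(labels[j], 0.0) + math.dist(p, q)
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        means = {c: sums[c] / counts[c] for c in sums}
        own = labels[i]
        others = [d for c, d in means.items() if c != own]
        if own not in means or not others:
            scores.append(0.0)  # singleton cluster (or only one cluster): scored as 0 here
            continue
        a, b = means[own], min(others)  # a: own-cluster cohesion, b: nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)
```

Running `kmeans` for a range of k values and comparing the resulting `silhouette_width` is one way to shortlist candidate solutions; here each element of `points` would be one hourly size distribution.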

Positive Matrix Factorization

PMF is a multivariate data analysis method commonly used for source apportionment in air quality studies, with numerous applications5,6,35. It describes the relationships between species measurements using a least-squares technique36. If X is the matrix of measured data with known experimental uncertainty (u), the method solves the bilinear matrix problem X = GF + E, where F is the matrix of the factors (sources), G is the matrix of the factors’ contributions and E is the matrix of residuals. F and G are determined so that the Euclidean norm of E divided by u is minimised, subject to the constraint that the elements of F and G are non-negative37. As the PMF is a descriptive model with no objective criterion for the optimal number of factors38, the solution that better described the conditions at each site was chosen.
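For clarity, the bilinear model and the weighted least-squares objective can be written element-wise. The index names are assumptions introduced here for illustration: t runs over the n hourly observations, j over the m measured variables (size bins) and i over the p factors. Minimising Q is equivalent to minimising the Euclidean norm of the uncertainty-scaled residuals.

$$x_{tj}=\sum_{i=1}^{p} g_{ti}\,f_{ij}+e_{tj},\qquad Q=\sum_{t=1}^{n}\sum_{j=1}^{m}\left(\frac{e_{tj}}{u_{tj}}\right)^{2},\qquad g_{ti}\ge 0,\; f_{ij}\ge 0$$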

Calculation of the estimated PM concentrations

The estimation of the PM contribution for factor (i) is calculated as:

$${{PM}}_{{{est}}_{(i)}}=\,{{F}_{{PM}}}_{(i)}\times {G}_{(i)}$$

where the elements of F(i) (provided as a result by the PMF method) represent the mean concentration of each variable, in this case the PM concentration, in each factor (i), when the elements of G(i) are normalised so that their mean value is equal to one37. G(i) is the relative contribution of factor (i) in the local atmosphere at a given time, which averages to 1 over the measurement period. This also takes into account the non-explained variance from all the factors combined, i.e. the variable’s variation not explained by the factors of the solution. While this method does not provide an accurate value for each variable (all variables of a factor are considered to fluctuate according to a single G contribution, attributed to its corresponding factor for the given period), it can be used as a reliable estimation of their variation. As each factor (source) is associated with a specific PNSD profile, for which variation is mainly expected in its intensity rather than in the particle concentration proportions within the factor, the estimations are expected to present a relatively accurate contribution of each factor to the PM concentrations.
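The calculation can be illustrated with a short numerical sketch. The function names and the example values are hypothetical; the sketch only assumes that rescaling G(i) to unit mean is compensated by the inverse scaling of F(i), so that their product is unchanged.

```python
def rescale_factor(f_pm_i, g_i):
    """Normalise a factor's contribution series to unit mean.

    The profile value is multiplied by the same mean so that the
    product F x G is preserved (assumed convention for this sketch).
    """
    mean_g = sum(g_i) / len(g_i)
    return f_pm_i * mean_g, [value / mean_g for value in g_i]


def estimate_pm(f_pm, g):
    """Estimated PM time series per factor: PM_est(i) = F_PM(i) x G(i).

    f_pm[i] is the PM concentration of factor i; g[i] is its
    unit-mean contribution series over the measurement period.
    """
    return [[f_pm[i] * g_it for g_it in g[i]] for i in range(len(f_pm))]


# Hypothetical example: one factor, two hourly time steps.
f_pm_0, g_0 = rescale_factor(2.0, [1.0, 3.0])
pm_est = estimate_pm([f_pm_0], [g_0])
```

With unit-mean contributions, the time average of each estimated series equals the corresponding F value, which is why F can be read as the mean PM concentration attributed to that factor.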