INTRODUCTION

The number of studies on the relation between both long- and short-term air pollution exposure from road traffic and various adverse health effects continues to grow.1, 2 Over time, developments in computer technology and exposure modeling have advanced modern exposure assessment from the use of large spatial scale exposure estimates based on a few continuous monitoring sites3 to exposure estimates on much finer "local" scales, describing intra-urban, cross-sectional,4 as well as temporally resolved exposure variation.5, 6 The need for better modeling techniques is underlined by documented large intra-urban variations found in monitoring studies,7 and adequate exposure assessment is a crucial part of environmental epidemiology with direct influence on study validity.8

Meteorological dispersion modeling (DM) and land-use regression modeling (LUR) are alternative methods describing small scale variations in air pollution levels, and both have been documented to estimate urban outdoor concentrations of NO2 and NOx well.9 DMs calculate the geographic distribution of air pollutants by combining data on emissions (point, line and area sources), the geophysical properties of the study area and meteorological conditions. LUR is a multiple linear regression technique using spatially dispersed monitoring and land-use data. In contrast to LUR, the DM can calculate concentrations at assigned locations at any time scale by adjusting for the interacting spatiotemporal effects of sources and meteorology.

It has been shown that LUR modeling may be enhanced by the use of time-varying traffic data and meteorological data such as temperature, relative humidity and wind data.10, 11, 12, 13, 14, 15, 16 These studies, however, did not explore the effect of other meteorological factors such as mixing height. At present, the benefits of a hybrid DM–LUR model compared with DM and LUR separately have been little investigated in a metropolitan setting. In addition, the potential to use LUR to investigate areas of improvement in DM has not been explored.

The aim of this study is to develop a hybrid spatiotemporal model for outdoor NOx levels in a large urban area, using NOx estimates from DM as well as land-use variables, meteorology and fixed monitoring data while adjusting for street canyon effects. We evaluated whether the hybrid model predicted better than either DM or LUR modeling developed separately, and could be used to identify potential improvement of both dispersion and LUR modeling.

MATERIALS AND METHODS

Dispersion Models

Two different dispersion models have been applied to the greater Stockholm area (35 by 35 km). Both used a detailed emission inventory for traffic sources, with annually updated traffic flows reported by the municipalities. This is a local road network database covering 90% of all the roads in Stockholm County, including information on traffic intensities for every road segment >500 vehicles/24 h averaged over 1 year. The proportion of heavy traffic (vehicles >3.5 tonnes) per road segment was estimated by the municipalities to be 4–10% of the total traffic depending on the road type, although up to 90% on some bus routes. The road network was digitized by each municipality (n=26) separately using traffic counts for some streets and estimations for the remaining streets. Emission factors were calculated for street segments as the emission per vehicle and distance (NOx/vkm), based on the HBEFA model (http://www.hbefa.net/e/index.html), considering several vehicle-related factors such as the age, type and weight of the vehicle, as well as the speed limits and the driving conditions. The number of vehicles on a street segment was adjusted for road-type-specific diurnal, weekly and monthly variations of traffic. These patterns were derived from actual traffic measurements on different typical road types.17 The inventory also included other sources,17 although the dominant source of NOx in Stockholm County was traffic,18 for which modeling was carried out in this study. To the estimates of both DMs, corresponding 2-week averages of rural NOx levels from a routine monitoring station were added. Both models were used to calculate 1-hour-average NOx concentrations at monitoring sites used for the Stockholm County part of the multinational project "European Study of Cohorts for Air Pollution Effects" and then averaged to correspond to the actual 2-week samplings at each site.
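
The temporal aggregation described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name and the example numbers are hypothetical, and it assumes that the hourly dispersion-model output contains only the urban traffic contribution, to which the rural background mean for the same period is added.

```python
from statistics import fmean

def period_mean_nox(hourly_modeled_nox, rural_background_mean):
    """Average hourly modeled NOx (μg/m3) over one sampling period and
    add the rural background mean (μg/m3) for the same period."""
    if not hourly_modeled_nox:
        raise ValueError("need at least one hourly value")
    return fmean(hourly_modeled_nox) + rural_background_mean

# Hypothetical example: a 2-week period has 14 * 24 = 336 hourly values.
hourly = [20.0, 30.0] * 168  # made-up modeled concentrations
estimate = period_mean_nox(hourly, rural_background_mean=3.0)
```

The resulting value is directly comparable with one 2-week passive-sampler observation at the same site.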

A multisource Gaussian dispersion model was used to calculate the urban background concentrations and for non-canyon traffic sites at a 500 m spatial resolution. The model is part of the SIMAIR modeling system used by the Swedish Meteorological and Hydrological Institute (Norrköping, Sweden http://airviro.smhi.se). Meteorological data were obtained from a 50 m high mast in a suburban district (Högdalen) in southern Stockholm, and these data were input to a diagnostic wind model to calculate a wind field over the whole model domain. Within the model domain, buildings, park trees and so on are parameterized as a rough surface that increases turbulence. It should be noted that the modeled values represent the average pollutant levels in a 500 m by 500 m area, whereas the monitored values are single points within these areas (individual streets or building effects are not resolved by this model).

The Gaussian model has been used extensively in epidemiological studies, describing long-term exposure concentrations on address level in Stockholm County.19, 20, 21 Furthermore, the model estimates have shown a high correlation with monitored annual exposure R2=0.74–0.80 over several years (1998–2005).22

To describe the NOx concentrations at street canyon traffic sites, the SIMAIR road model was used.23 The domain of the SIMAIR road model covered greater Stockholm, except the municipalities of Sundbyberg and Solna. This model calculates concentrations along individual streets with buildings on both sides of the street. Meteorological data were supplied to the SIMAIR road model from a system called MESAN (MESoscale ANalysis), which makes use of all available measurement stations, radar and satellites combined with a background field forecast, providing near-real-time weather now-casting in 11 × 11 km grids.23, 24, 25 A brief description of the MESAN system may be found in the Supplementary Material.

Spatially Distributed Measurements

The spatially distributed observations of NOx were from the European Study of Cohorts for Air Pollution Effects project (ESCAPE). Within the ESCAPE project, a coordinated campaign for the monitoring of study-area-specific levels of NOx and other pollutants was organized in several European countries. On the basis of these measurements, area-specific LUR models were developed. The details of the measurements have been described elsewhere.7 Briefly, the monitoring campaign in Stockholm County was conducted from 1 December 2008 until 11 July 2009. The spatial variation of NOx was measured at 40 monitoring sites distributed to capture traffic-related exposure scenarios at home addresses in Stockholm County. Site-specific measurements were obtained for three biweekly periods. The choice of periods aimed to cover seasonal variations, and up to 10 sites were monitored simultaneously. NOx was measured using Ogawa diffusion badges.7 Geographical coordinates were attributed to each monitoring site by Lantmäteriet, the Swedish mapping, cadastral and land registration authority in charge of mapping property boundaries in Sweden.

For the purpose of this study, the following categories and inclusion criteria for the ESCAPE monitoring sites were successively applied:

  1. The site should be within the spatial domains of our DM models. Six sites were situated outside the area of the models. One of these is a rural background site (at Norr Malma, ca. 70 km north of Stockholm) and was used as an estimate of the non-urban source contribution to the modeled concentrations.

  2. To avoid a strong influence from single sites with very high traffic volumes, only sites with <100,000 vehicles/24 h on the nearest street were included (one site was excluded).

  3. Traffic sites should be located close to the road (15 m), and if situated on a building, 10 m above the street height. The facade should also face a street with 10,000 vehicles/24 h.

    a. Street canyon sites were defined as having buildings on both sides of the street (one street canyon site was outside the domain of the SIMAIR model and therefore excluded).

The final data set thus included 31 sites: 11 traffic, 16 urban and 4 rural sites. Each monitored 2-week average was considered as one observation, and given the three monitoring periods, in total, 93 observations were used.

The NOx levels at street canyon traffic sites were calculated by the SIMAIR model. For all other sites, the Gaussian dispersion model was used. We additionally collected continuous NOx data from three stationary routine monitoring stations (STATMON) representing the regional background (Norr Malma, located in a rural area ~70 km north of Stockholm and 1.4 km from the nearest major road), urban background (Torkel Knutssonsgatan, located on a rooftop 25 m above street level in central Stockholm) and traffic (Hornsgatan, located in a canyon street with >30,000 vehicles/24 h). The geographic positions of monitors and directions of nearby streets can be found in Supplementary Figure 1. Rose plots indicating the distribution of wind direction in the Stockholm region can be found elsewhere.26 The measurements were provided by the Environment and Health Administration of Stockholm (www.slb.nu) and covered the same dates as the monitor-specific 2-week periods. Each biweekly mean included observations from at least 10 days.

The NOx concentrations observed at Norr Malma (regional background) were added to the calculated concentrations from DMs as the models only considered the contributions from the urban traffic sources. For each 14-day period, a "delta urban NOx" predictor was calculated as the difference between Torkel Knutssonsgatan (urban) and Norr Malma (regional), and a "delta traffic NOx" predictor was calculated as the difference between Hornsgatan (traffic) and Torkel Knutssonsgatan (urban). These two variables were offered as predictors in LUR modeling. Descriptive data for the stationary monitoring and STATMON predictors can be found in Supplementary Table 1.
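
As a small illustration of the two stationary-monitoring predictors, the sketch below computes them for one 2-week period. The function name and the input values are hypothetical. The sign convention (urban less rural, street less urban) is taken from the way the deltas are reported in the Results.

```python
def delta_predictors(rural, urban_rooftop, street):
    """Return the two STATMON predictors for one 2-week period (μg/m3)."""
    return {
        # urban increment over the regional background
        "delta_urban_nox": urban_rooftop - rural,
        # street increment over the urban background
        "delta_traffic_nox": street - urban_rooftop,
    }

# Hypothetical period means for Norr Malma, Torkel Knutssonsgatan, Hornsgatan
deltas = delta_predictors(rural=3.0, urban_rooftop=15.0, street=115.0)
```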

Additional Spatial and Temporal Data

The extraction and definition of the land-use data have been described elsewhere.27 Briefly, based on coordinates for the study-specific monitoring sites, predictor data were collected in a geographic information system (ArcMap 9). Predictors based on land-use and population data were created in the form of buffer zones around monitoring sites, whereas predictors based on traffic data were also based on distance from the site to the road.

The traffic variables for Stockholm County were primarily based on the road network provided by the Eastern Sweden Air Quality Management Association (www.slb.nu/lvf), that is, the same database used for the DM. For LUR, predictors were calculated as the inverse distance and the inverse distance squared to the nearest road and the nearest major road (m−1, m−2). The total length of roads (m), based on all roads and on major roads only, was calculated in buffers of 25, 50, 100, 300, 500 and 1000 m radii. A major road was defined as a road with >5000 vehicles/24 h. The buffer sizes were selected so as to describe both near sources and sources of urban background levels.28 For the same buffer sizes, the "traffic load" on the nearest roads was calculated as the sum of the lengths of road segments multiplied by the traffic intensity attributed to each segment. The same calculations were then carried out using the heavy traffic intensities only.
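
The traffic predictors defined above reduce to simple arithmetic once the geometry has been resolved in the GIS. The following sketch assumes that each road segment has already been clipped to the buffer, so that only the length inside the buffer is passed in. The function names and numbers are hypothetical.

```python
def inverse_distance_predictors(distance_m):
    """Return (1/d, 1/d^2) for the distance to the nearest road (m^-1, m^-2)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_m, 1.0 / distance_m ** 2

def traffic_load(segments):
    """Sum of (length inside buffer, m) * (vehicles/24 h) over road segments."""
    return sum(length_m * vehicles_per_day
               for length_m, vehicles_per_day in segments)

# Hypothetical segments already clipped to a 100 m buffer
segments_100m = [(100.0, 5000), (50.0, 12000)]
load = traffic_load(segments_100m)
inv_d, inv_d2 = inverse_distance_predictors(20.0)
```

Passing heavy-vehicle counts instead of total counts gives the corresponding heavy traffic load.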

To adjust for missing roads, particularly for rural locations, we complemented the road network with the Euro streets digital road network version 3.1. This road network is based on the TeleAtlas MultiNet TM from the year 2008. The MultiNet TM road network covers roads in Stockholm County with traffic intensities of <500 vehicles/24 h, but lacks information on traffic intensity. The additional road information allowed us to better estimate the distance from all monitors to the nearest road. Furthermore, fixed values of 500 vehicles/24 h and 0 heavy vehicles/24 h were attributed to these roads. These values were only used for predictors based on the distance to the road.

Land-use data were extracted from the CORINE (Coordination of Information on the Environment) land cover data 2000 (CLC2000), governed by the European Environment Agency. The data were originally based on raster images from the Landsat-7 satellite, but were used in vector form in this study. The minimum mapping unit (size of area vector) was 25 ha, corresponding, for example, to a 500 × 500 m square. Final predictor variables covered urban green, seminatural areas, forest areas, high-density residential land, low-density residential land, industry and ports. Each predictor was based on the amount of surface area in buffer zones with radii of 100, 300, 500, 1000 and 5000 m. Population was modeled as the number of individuals within the buffer zones using a 100 × 100 m grid map with counts of citizens attached to each grid cell for the year 2005. The amount of surface water within buffer zones was registered using a terrain map from the Swedish mapping, cadastral and land registration authority for the year 2005 with an accuracy of +/−10 m.
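
A buffer-based population predictor of the kind described above could be computed along the following lines. This is only a sketch under a stated assumption: a grid cell is counted when its centroid falls inside the buffer. The text does not say how partially covered cells were handled, and the function name and data are hypothetical.

```python
from math import hypot

def population_in_buffer(site_xy, grid_cells, radius_m):
    """Sum the population of grid cells whose centroid lies within
    radius_m of the monitoring site (projected coordinates in meters)."""
    sx, sy = site_xy
    return sum(count for (cx, cy, count) in grid_cells
               if hypot(cx - sx, cy - sy) <= radius_m)

# Hypothetical 100 x 100 m cells given as (centroid_x, centroid_y, residents)
cells = [(0.0, 0.0, 10), (100.0, 0.0, 5), (400.0, 0.0, 7)]
pop_300 = population_in_buffer((0.0, 0.0), cells, 300.0)
pop_500 = population_in_buffer((0.0, 0.0), cells, 500.0)
```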

Meteorological predictors (MET) used in the LUR modeling included temperature, relative humidity, global radiation and wind vectors. Wind vectors were computed as eastern and northern wind direction components, together with a separate variable for wind speed. All meteorological measurements were obtained at a stationary monitoring station positioned at rooftop level in central Stockholm (Torkel Knutssonsgatan). The MET predictors represented 2-week averages covering the same periods as the monitored NOx concentrations. Descriptive data for the MET predictors can be found in Supplementary Tables 1 and 2.
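
One plausible way to obtain the wind predictors is sketched below. It assumes that wind direction is given in degrees clockwise from north and that the eastern and northern components are unit-vector components averaged over the period. The exact convention used in the study is not stated, so the function and its inputs are hypothetical.

```python
from math import sin, cos, radians
from statistics import fmean

def wind_predictors(directions_deg, speeds_ms):
    """Return period means of the eastern and northern direction
    components and of the wind speed (m/s)."""
    east = fmean([sin(radians(d)) for d in directions_deg])
    north = fmean([cos(radians(d)) for d in directions_deg])
    return east, north, fmean(speeds_ms)

# Hypothetical readings: one northerly and one easterly wind observation
east, north, speed = wind_predictors([0.0, 90.0], [2.0, 4.0])
```

Splitting direction into two components avoids the discontinuity at 0°/360° that would distort a plain average of angles.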

Regression Model Development

Using multiple linear regression, two LUR models were developed: a standard LUR model ("LUR") based on the above-mentioned spatial land-use variables, and a LUR model denoted "LUR+MET+STATMON" that also included the temporally defined MET and STATMON data. For both models, first a univariate ordinary least squares regression model was developed for each predictor. The model best explaining the observed variance (R2) was kept. To this model, all the remaining predictors were added separately using a repeated stepwise forward regression method, and in each turn, the predictor adding the most additional explained variance was included. Predictors entering the model had to add at least 1% explained variance while having a coefficient with the correct predefined direction of effect. Furthermore, the new predictor should not influence the direction of effect of other predictors.29 We allowed for more than one buffer size of the same predictor to enter the model. Whenever two buffer sizes of the same predictor were retained, the larger buffer was rewritten into a "doughnut-shaped" buffer in order to exclude the inner buffer area. From the final model, predictors with a P-value >0.1 (using two-tailed significance testing) were removed.
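
The selection procedure can be sketched in plain Python as follows. This is an illustration of the stated rules, not the authors' implementation: the helper names, predictor names and data are hypothetical, the least-squares fit uses the normal equations without any safeguard against singular designs, and the final removal of predictors with P >0.1 is omitted.

```python
def ols(columns, y):
    """Fit y = b0 + sum(b_j * x_j) via the normal equations.
    Returns (coefficients with intercept first, R2)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal-equation matrix [X'X | X'y]
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                 for i in range(n))
    return beta, 1.0 - ss_res / ss_tot

def forward_select(candidates, expected_sign, y, min_gain=0.01):
    """Add, one at a time, the predictor giving the largest R2 gain,
    provided the gain is >= min_gain and every coefficient in the
    trial model keeps its predefined direction of effect."""
    selected, best_r2 = [], 0.0
    while True:
        best = None
        for name in candidates:
            if name in selected:
                continue
            trial = selected + [name]
            beta, r2 = ols([candidates[n] for n in trial], y)
            signs_ok = all(b * expected_sign[n] > 0
                           for b, n in zip(beta[1:], trial))
            if (signs_ok and r2 - best_r2 >= min_gain
                    and (best is None or r2 > best[1])):
                best = (name, r2)
        if best is None:
            return selected, best_r2
        selected.append(best[0])
        best_r2 = best[1]

# Hypothetical data: y depends on two predictors, the third is irrelevant.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 3, 8, 1, 9, 2, 7, 4]
x3 = [1, -1, -1, 1, 1, -1, -1, 1]
y = [10 + 2 * a + 3 * b for a, b in zip(x1, x2)]
candidates = {"traffic_load_50m": x1, "population_300m": x2, "irrelevant": x3}
signs = {name: +1 for name in candidates}
selected, final_r2 = forward_select(candidates, signs, y)
```

Checking the signs of all coefficients in each trial model covers both requirements in the text: the new predictor must have the predefined direction of effect, and it must not flip the direction of predictors already in the model.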

The hybrid model (DM+LUR+MET+STATMON) was based on the same predictor data and modeling technique as the LUR+MET+STATMON model, with the addition of DM exposure estimates offered as a potential predictor. As an intermediate step, a DM+MET+STATMON model was developed, similar to the above LUR+MET+STATMON model. For all models, the residuals vs fit plots indicated a random distribution of errors. There was, however, a general weak trend toward larger variance at higher fitted values. The variance in the regression models was estimated with a cluster robust method30 to avoid the underestimation of variance because of repeated sampling. Potential multicollinearity between the predictors in the final models was investigated using the variance inflation factor (VIF) test; all variables had a VIF <3 and were therefore kept in the models.
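
For readers unfamiliar with the VIF, the sketch below shows the calculation in the simplest case of two predictors, where the R2 from regressing one predictor on the other equals the squared Pearson correlation. In a model with more predictors, each predictor is regressed on all the others instead. The function names and data are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R2); with two predictors, R2 = r squared."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictor values with a correlation of 0.8
vif = vif_two_predictors([1, 2, 3, 4], [1, 3, 2, 4])
keep_both = vif < 3  # threshold used in the text
```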

The performance of all models was assessed and compared by the model-specific proportion of explained variability (R2), the root mean square error (RMSE) and the best visual fit. To estimate the model robustness, leave-one-out cross-validation (LOOCV) was applied on all the models.7 Differences in model performance were also tested for statistical significance using the Wald test. To assess the degree of association between monitored NOx and predictors separately in the final hybrid model, partial R2 were calculated.
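
The LOOCV procedure can be illustrated with a univariate model, for example monitored NOx regressed on the DM estimate. The sketch assumes that the cross-validated R2 is computed as one minus the ratio of the squared prediction errors to the total sum of squares. The text does not specify the exact definition, and the names and data are hypothetical.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((v - mx) * (w - my) for v, w in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def loocv_r2_rmse(x, y):
    """Refit the model n times, each time predicting the one
    observation that was left out; return (R2, RMSE)."""
    preds = []
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    n = len(y)
    my = sum(y) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(y, preds))
    ss_tot = sum((o - my) ** 2 for o in y)
    return 1.0 - ss_res / ss_tot, sqrt(ss_res / n)

# Hypothetical DM estimates and monitored 2-week means (μg/m3)
dm = [10.0, 20.0, 30.0, 40.0, 50.0]
obs = [12.0, 19.0, 33.0, 41.0, 48.0]
cv_r2, cv_rmse = loocv_r2_rmse(dm, obs)
```

Because every prediction is made for an observation the model has not seen, the cross-validated R2 cannot exceed the R2 of the model fitted to all observations.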

RESULTS

During a 12-month period including the monitoring campaign (Figure 1), the annual NOx level derived from stationary monitoring (with biweekly standard deviation) in the rural background was 3.0 (SD 1.0) μg/m3, whereas the delta urban (rooftop less rural) was 12.0 (SD 5.0) μg/m3 and the delta street (street less rooftop) was 100.0 (SD 21.0) μg/m3. During our monitoring campaign, comprising a total of 93 2-week measurements, the corresponding levels were 3.3 (SD 1.36) μg/m3 for rural, 12.3 (SD 4.7) μg/m3 for urban less rural and 102.4 (SD 15.4) μg/m3 for street less urban.

Figure 1

Mean levels of daily NOx observed at a rural, urban and traffic site, and the daily mean global radiation during the years of the monitoring campaign in Stockholm County.

Comparing the spatially distributed measured NOx values with the DM shows that DM performed well with an R2 of 0.68, a RMSE of 12 μg/m3 and a regression slope of 1.14 (Table 1). At NOx levels below about 30 μg/m3, there was a tendency of overestimation, and for higher levels, a tendency of underestimation. The basic LUR model (without any temporal variables) explained 58% of the variability within the measured NOx (R2=0.58, RMSE 13.9 μg/m3) (Table 1). Similar to the DM, low levels were overestimated and high levels were underestimated (Figure 2). The predictors included in this model were population within a radius of 300 m and the total number of vehicles per 24 h at the nearest street (Table 1).

Table 1 Performance evaluation (coefficient of determination/root mean square error and leave-one-out cross-validation) and model structures of the DM, LUR and hybrid model, explaining observed levels of NOx.

Figure 2

Dispersion and land-use regression-modeled predictions of NOx concentrations related to 93 biweekly monitored NOx observations by univariate regression. DM, Airviro Gauss and SIMAIR road dispersion model; LUR, land-use regression model.

Both DM and LUR explained measured values significantly better when temporal variables were also included. In the DM+MET+STATMON model, the DM estimates were complemented with the meteorological predictor global radiation and delta urban NOx (urban–rural). Global radiation levels had a clear annual pattern with peak levels from the end of April until September (Figure 1). The inclusion of these predictors increased model performance (R2=0.82, RMSE=9.14; Table 1, Figure 3). The LUR+MET+STATMON model included similar variables: the delta urban NOx (urban–rural), traffic intensity on the nearest street, population within a radius of 100 m and global radiation. The model performance was also similar to that of the DM+MET+STATMON model (R2=0.80, RMSE=9.70; Table 1, Figure 3).

Figure 3

Comparison of model-specific NOx predictions from three modeling scenarios: (1) dispersion modeling with additional information on global radiation, (2) land-use regression modeling including global radiation and (3) a hybrid model including dispersion modeling, global radiation and LUR components. DM, Airviro Gauss and SIMAIR road dispersion model; LUR, land-use regression model; MET, meteorological variables (global radiation); STATMON, stationary monitoring, delta urban NOx (urban–rural).

We found the hybrid model (DM+LUR+MET+STATMON) to perform better than any other model. The model captured 89% of the variance in the monitored concentrations (R2=0.89) and had the lowest model RMSE value (7.14). Furthermore, the predicted NOx estimates seemed more accurate across the whole exposure range (Table 1, Figure 3). The Wald test indicated that the difference in performance by the hybrid model compared with the DM+MET+STATMON and LUR+MET+STATMON was significant (P<0.01). The hybrid model included the following predictors: the DM estimates, traffic intensity on the nearest street, population within 100 m, global radiation and delta urban NOx (urban–rural) background (Table 1). Except for the DM estimate, traffic intensity on the nearest street was found to be the most correlated predictor according to the partial R2 (Supplementary Table 3).

The VIF test did not indicate multicollinearity between predictors in any of the models (all VIF <3); strong multicollinearity could otherwise have led to unreliable regression coefficients. In the LOOCV analysis, the models explained 2–3% less variance, indicating good model robustness (Table 1).

DISCUSSION

We demonstrated the possibility to improve DM using a LUR framework and to evaluate areas of improvement in the DM. As expected, we found the final hybrid model to perform better than the DM and LUR models separately.

A similar result was found in another study,15 where the performance of a LUR–DM hybrid was compared with LUR and DM models at monitoring sites, describing biweekly near-road exposure gradients. The performance of the best hybrid model for NOx in that study (R2=0.71) was somewhat lower than that of our model, but there were some differences between the hybrid models. The dispersion model output was retrieved from a simplified version of the Caline 3 model and explained 26% of the variance in the data (compared with 68% in our study). Furthermore, the monitors were positioned so as to exclude infrastructural influences on meteorology by high-rise buildings and street canyons, thereby omitting exposure scenarios for living conditions common to many people in city centers and similar areas. Finally, in our model, LUR predictors based on meteorological data and stationary monitoring were also included. A similar hybrid modeling approach that included meteorological LUR data was performed in Boston.16 LUR modeling was complemented with traffic-derived NO2 levels calculated by a version of the Caline 3 model. The authors reported that the DM component improved the LUR models by 3–10% (cross-validated R2). The modeling differed from our approach by not adjusting for street canyon effects, focusing on the winter season, assigning one fleet-wide NOx emission factor for all vehicle and road types, and not including global radiation as a LUR predictor. In our study, most of the variance (spatial and temporal) in the monitored data was covered by the DM, but the model had a tendency to overestimate the lowest NOx levels while underestimating high concentrations. A similar performance was found for the Dutch DM tool "CAR" when used to model annual small scale variations in NO2 in the city of Amsterdam.31 This model mostly underestimated the local traffic contribution and displayed the least accuracy for the highest concentrations. This was in part explained by the authors as a difficulty in modeling complicated traffic situations, such as often congested, heavily trafficked roads. Several earlier studies have reported that real-world emissions are underestimated, particularly for some types of vehicles.32 As seen in Figure 3, our hybrid model performed well at all concentration ranges, possibly owing to the incorporation of very local land-use characteristics.

The addition of global radiation was found to be important. Global radiation is a measure of the incoming direct sunlight as well as diffuse and scattered light and is inversely related to the ground levels of air pollutants emitted below the boundary layer.33 In an earlier validation of the Gaussian dispersion models used here for urban sites (and open space traffic sites), discrepancies between monitored and modeled daily averages of NOx were proposed to relate to deficiencies in the model's parameterization of mixing processes in the planetary boundary layer. For example, if the mixing is underestimated in the afternoon, the dilution of the emissions will be underestimated, resulting in too high estimates of pollutant concentration.34 The model with global radiation seemed to provide a better fit at low levels, but there were not enough observations to formally test whether global radiation is less relevant at higher concentrations, for example, in highly trafficked street canyons. The performance of the LUR model (R2=0.58) compared with the DM (R2=0.68) was good considering that only spatially related predictors were used to explain a 2-week average. However, we demonstrated that the model could be improved substantially by also including time-varying meteorological and routine monitoring data. In earlier LUR models, meteorological components such as temperature, relative humidity, wind speed, wind vectors and cloud cover were used on different time scales, but not global radiation.10, 11, 12, 13, 14, 35

The traffic- and population-based LUR predictors used in our hybrid model are commonly used in LUR modeling.36 The traffic predictor "traffic intensity on the nearest street" may reflect the difficulty of accounting for the influence of very near traffic, or indicate an underprediction of vehicle emissions.

The predictor "population" has been described as a marker of air pollution variability related to the differences in urban and rural living environments including sources of traffic and home heating.37 In Stockholm, this may act as a marker of the amount of traffic in the neighborhood including public transport and commercial activities, but could perhaps also reflect other aspects of urbanicity, such as street configuration and use of off-road machinery.

Finally, a description of the temporal NOx variability on an urban scale, "delta NOx (urban–rural)", was found to contribute significantly to the hybrid model. This may indicate that the dispersion-modeled 2-week mean levels did not fully incorporate the temporal changes of all NOx sources. The traffic-related NOx contribution is estimated to be about 60% in this region,38 but the measured urban background (less rural background) was only weakly correlated to the DM estimates that included traffic-related NOx sources and the rural contribution (R2=0.10, data not shown). Therefore, it is possible that the time variations in urban background (less rural background) represent the influence of non-traffic NOx sources such as energy production, off-road machinery and shipping. Our study indicates that future improvements in the DM could decrease exposure misclassification for both long- and short-term exposure assessment.

This type of hybrid model development needs DM, routine monitoring, land-use data and a campaign of spatially distributed measurements, and is therefore data- and computation-intensive. Ideally, spatial models trained on observed air pollution data should be evaluated with completely separate sets of observed air pollution measurement data. We did not have monitoring data for this kind of comparison and therefore used the LOOCV technique. The LOOCV method has been suggested to give overly optimistic R2 statistics compared with validations on external data sets for the same models, but the gap between validation and LOOCV R2 has been reported to be modest when using 80 or more observations.39 One mechanism that may contribute to an inflated R2 is when a good model fit is obtained by including many predictors in relation to the number of observations for which variance should be explained by the model. It has been suggested that linear regression models may be free from such overfitting when the number of observations is 2–10 times the number of predictors, depending on the type of predictor.40, 41, 42 Our models were developed on 93 observations from 31 different sites, during 12 different time periods (up to 10 sites could be monitored in parallel). In the final model, three spatial and three temporal predictors (DM counted in both categories) were included; there might be some overfitting for the temporal variables, but probably not for the spatial variables. Some of the limitations of our study could be addressed in future work, for example, by using denser monitoring to capture the differences in air pollution levels, especially at traffic sites, also considering ventilation effects due to street orientation. Furthermore, as our and other models indicate that the largest uncertainty exists in the tails of the modeled concentration range, an oversampling of extreme sites might prove helpful. A larger variation in sites would also provide the possibility to better define when to use DMs of different resolution across the study domain. Future studies might also gain from monitoring campaigns covering more of the temporal variations within one year or even between several years. The quality of data is always paramount, and more detailed data on traffic and heavy traffic intensities on streets with <500 vehicles/24 h could have improved our modeling. However, these streets are typically within residential areas, which is why we decided to attribute zero heavy traffic. Future studies could evaluate the uncertainties in the HBEFA database by using measurements in controlled environments such as tunnels.
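
The overfitting argument in this paragraph can be made concrete with a little arithmetic. The sketch below is one possible reading, in which the number of sites is treated as the effective sample for spatial predictors and the number of time periods as the effective sample for temporal predictors. The counts are those given in the text, while the helper name is hypothetical.

```python
def units_per_predictor(n_units, n_predictors):
    """Number of independent units available per model predictor."""
    return n_units / n_predictors

n_observations, n_sites, n_periods = 93, 31, 12  # from the text
n_spatial, n_temporal = 3, 3                     # DM counted in both
n_distinct = n_spatial + n_temporal - 1          # DM counted once

spatial_ratio = units_per_predictor(n_sites, n_spatial)
temporal_ratio = units_per_predictor(n_periods, n_temporal)
overall_ratio = units_per_predictor(n_observations, n_distinct)

# Rule-of-thumb range quoted in the text: 2 to 10 units per predictor
spatial_clears_upper_bound = spatial_ratio >= 10
temporal_clears_upper_bound = temporal_ratio >= 10
```

Under this reading, the spatial predictors sit just above the upper end of the suggested range and the temporal predictors fall inside it, which is in line with the cautious statement about possible overfitting for the temporal variables.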

CONCLUSIONS

A hybrid spatiotemporal model, combining DM, local land use, and centrally monitored pollutants and meteorology, explained variation of 2-week average NOx concentrations within a metropolitan area significantly better than DM alone. This indicates that there is a potential for improving long-term estimates of air pollutant concentrations based on DM by incorporating further spatial characteristics of the immediate surroundings. In addition, our results suggest that the inclusion of data from routine air pollution monitoring and meteorology may improve both DM and LUR in spatially resolved short-term assessment.