
Spatial identification and temporal prediction of air pollution sources using conditional bivariate probability function and time series signature


Accurately identifying distant, large, and frequent emission sources in cities is difficult because of the presence of large-sized pollutants and the many land use types involved. This study aims to simplify and optimize the visualization of long time series of air pollution data, particularly for urban areas, which are naturally correlated in time and spatially complicated to analyze. We also elaborate different sources of pollution that were hitherto undetectable using ordinary plot models by leveraging recent advances in ensemble statistical approaches. The conditional bivariate probability function (CBPF) and time series signature were integrated within the R programming environment for the analysis. Hourly air pollution data from 2007 to 2016 were collected at four air quality stations (ca0016, ca0058, ca0054, and ca0025) situated in highly urbanized locations characterized by complex land use and high pollution-emitting activities. The CBPF was used to analyze the data, with concentrations of sulfur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), and particulate matter (PM10) as a third variable plotted in polar coordinates against wind direction and wind speed. A generalized linear model (GLM) and sensitivity analysis were applied to verify and visualize the relationship between the Air Pollution Index (API) of PM10 and the other significant pollutants in the GLM outputs, based on quantile values. To address potential future challenges, we forecast 3 months of PM10 values using a Time Series Signature statistical algorithm with time functions and validated the outcome at the four stations. The results reveal that sources emitting PM10 have similar activities producing the other pollutants (SO2, CO, and NO2). These pollutants can therefore be detected by cross selection between the pollution sources in the affected city. The directional CBPF results indicate that ca0058 and ca0054 enable easier detection of pollutant sources than ca0016 and ca0025 because they are located on the edge of industrial areas. The outcomes of the CBPF technique and time series signature analysis are promising: they reveal sources of pollution that ordinary plot models did not detect and thus contribute to existing air quality assessment and enhancement mechanisms.
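To make the CBPF idea in the abstract concrete, the following is a minimal sketch, not the authors' R workflow. It assumes the usual definition of the CBPF: for each wind direction and wind speed cell, the number of samples whose concentration falls in a chosen quantile interval, divided by the total number of samples in that cell. The function name `cbpf`, the 30-degree and 2 m/s bin widths, and the 75th percentile threshold are illustrative assumptions rather than values taken from the study.

```python
import numpy as np

def cbpf(wind_dir, wind_speed, conc, lower_q=0.75, upper_q=1.0,
         dir_bin=30.0, speed_bin=2.0):
    """Return (prob, counts) on a (direction, speed) grid.

    prob[i, j] is the fraction of samples in cell (i, j) whose concentration
    lies between the lower_q and upper_q quantiles; counts[i, j] is the
    number of samples in the cell. Empty cells are NaN in prob.
    Bin widths and quantiles are illustrative assumptions.
    """
    wind_dir = np.asarray(wind_dir, dtype=float) % 360.0
    wind_speed = np.asarray(wind_speed, dtype=float)
    conc = np.asarray(conc, dtype=float)

    # Drop records with any missing value.
    ok = ~(np.isnan(wind_dir) | np.isnan(wind_speed) | np.isnan(conc))
    wind_dir, wind_speed, conc = wind_dir[ok], wind_speed[ok], conc[ok]

    # Concentration interval defined by quantiles of the whole series.
    lo, hi = np.quantile(conc, [lower_q, upper_q])
    in_range = (conc >= lo) & (conc <= hi)

    # Assign each sample to a direction bin and a speed bin.
    n_dir = int(round(360.0 / dir_bin))
    n_spd = int(np.floor(wind_speed.max() / speed_bin)) + 1
    d_idx = np.minimum((wind_dir // dir_bin).astype(int), n_dir - 1)
    s_idx = (wind_speed // speed_bin).astype(int)

    counts = np.zeros((n_dir, n_spd))
    hits = np.zeros((n_dir, n_spd))
    np.add.at(counts, (d_idx, s_idx), 1)
    np.add.at(hits, (d_idx[in_range], s_idx[in_range]), 1)

    # Ratio of in-range samples to all samples, cell by cell.
    with np.errstate(invalid="ignore", divide="ignore"):
        prob = np.where(counts > 0, hits / counts, np.nan)
    return prob, counts


# Synthetic demonstration (made-up data, not the study's measurements):
# a hypothetical source seen for winds from 60-120 degrees at 4-6 m/s.
rng = np.random.default_rng(0)
n = 5000
wd = rng.uniform(0, 360, n)
ws = rng.uniform(0, 8, n)
pm10 = rng.gamma(2.0, 10.0, n)
from_source = (wd >= 60) & (wd < 120) & (ws >= 4) & (ws < 6)
pm10[from_source] += 200.0
prob, counts = cbpf(wd, ws, pm10)
```

In this synthetic case the cells covering the hypothetical source stand out with high probabilities, which is the pattern a CBPF polar plot is meant to expose. In practice, cells with very few samples should be treated with caution, since a ratio based on a handful of hours is unstable.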


Fig. 1: Study area.
Fig. 2: Missing data.
Fig. 3: Summary plot.
Fig. 4: Box plots.
Fig. 5: Whisker plots.
Fig. 6: CBPF plot.
Fig. 7: Polar plots.
Fig. 8: Rose plot.
Fig. 9: Sensitivity analysis.




The authors gratefully acknowledge the financial support from the Universiti Teknologi PETRONAS (UTP) STIRF research grant [0153AA-F83] for this project. We are also very grateful to the Department of Environment (DoE), Malaysia, for providing the air quality data used in this study, and to the Federal Department of Town and Country Planning (PLANMalaysia) for providing spatial and attribute data of the study area.

Author information


Corresponding author

Correspondence to Abdul-Lateef Balogun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Althuwaynee, O.F., Pokharel, B., Aydda, A. et al. Spatial identification and temporal prediction of air pollution sources using conditional bivariate probability function and time series signature. J Expo Sci Environ Epidemiol 31, 709–726 (2021).
