
  • Original Article

The verified neighbor approach to geoprivacy: An improved method for geographic masking

Abstract

Geographic information adds a powerful component to environmental epidemiology studies but can compromise subject confidentiality. Although locations are often masked by perturbing spatial coordinates, existing masks do not ensure that the perturbation area contains a sufficient number of valid surrogates to prevent disclosure, nor are they designed to minimize perturbation while maintaining a specified level of privacy. I introduce a new approach to geoprivacy in which real property parcel data with information about land use are used to develop a pool of verified neighbors. GIS (geographic information system) processing optionally restricts the pool to residences with values of environmental variables similar to those of the subject parcel. A surrogate is then randomly selected from the k members of the pool closest to the subject with k chosen to achieve the desired spatial privacy protection. The method guarantees the specified level of privacy even where population density is uneven while minimizing spatial distortion and changes to the values of environmental variables assigned to subjects. The method is illustrated with an example that found it to be more effective than random perturbation-based methods in both protecting privacy and preserving spatial fidelity to the original locations.
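The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the `Parcel` record, its field names, the `"residential"` land-use code, the use of straight-line distance, and the choice to exclude the subject's own parcel from the pool are all assumptions made for the example.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Parcel:
    """Hypothetical parcel record; field names are illustrative only."""
    parcel_id: str
    x: float        # projected easting, e.g. in metres
    y: float        # projected northing, e.g. in metres
    land_use: str   # assumed land-use code, e.g. "residential"
    env: str = ""   # example environmental variable, e.g. a water district


def verified_neighbor_mask(subject, parcels, k, match_env=True, rng=None):
    """Return a surrogate drawn at random from the k nearest verified neighbors.

    Assumptions (not taken from the paper): the subject parcel is excluded
    from the pool, distance is Euclidean, and an environmental match means
    equality of the `env` field.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = rng or random.Random()

    # Build the pool of verified neighbors: residential parcels, optionally
    # restricted to those sharing the subject's environmental value.
    pool = [
        p for p in parcels
        if p.parcel_id != subject.parcel_id
        and p.land_use == "residential"
        and (not match_env or p.env == subject.env)
    ]
    if len(pool) < k:
        raise ValueError(f"only {len(pool)} verified neighbors; need {k}")

    # Keep the k pool members closest to the subject, then pick one uniformly.
    pool.sort(key=lambda p: math.hypot(p.x - subject.x, p.y - subject.y))
    return rng.choice(pool[:k])
```

Because the candidate set is defined by rank (the k nearest valid parcels) rather than by a fixed radius, the displacement shrinks where residences are dense and grows where they are sparse, while the number of candidate surrogates stays at k.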



Acknowledgements

I thank Jay Nuckols for suggesting the work that led to this paper, Erin Bell for the inspiration to write the paper, Celine Barakat for her assistance with initial development of the method, and Jonathan Riehl for programming help. Tom Hart provided valuable review and comments. Early work on this paper was supported in part by grant U50/CCU223284-01 from the United States Centers for Disease Control and Prevention to the New York State Department of Health under their Environmental and Health Effect Tracking initiative. I appreciate comments from reviewers that led to improvements in the paper.

Author information


Correspondence to Wayne Richter.

Ethics declarations

Competing interests

The author declares no conflict of interest.

Additional information

Supplementary Information accompanies the paper on the Journal of Exposure Science and Environmental Epidemiology website.


Cite this article

Richter, W. The verified neighbor approach to geoprivacy: An improved method for geographic masking. J Expo Sci Environ Epidemiol 28, 109–118 (2018). https://doi.org/10.1038/jes.2017.17


  • DOI: https://doi.org/10.1038/jes.2017.17
