
## Abstract

As mitigating car traffic in cities has become paramount to abating the effects of climate change, fostering public transport in cities appears ever more appealing. A key ingredient for that purpose is easy access to mass rapid transit (MRT) systems. So far, however, we have few empirical estimates of the coverage of MRT in urban areas, computed as the share of people living in MRT catchment areas, for instance within walking distance. In this work, we clarify a universal definition of such a metric, People Near Transit (PNT), and present measures of this quantity for 85 urban areas in OECD countries, the largest dataset of such a quantity so far. By suggesting a standardized protocol, we make our dataset sound and expandable to other countries and cities in the world, which gives our work a solid basis for multiple reuses in transport, environmental or economic studies.

Measurement(s): people near transit • access to mass rapid transit

Technology Type(s): digital curation • computational modeling technique

Factor Type(s): OECD country

Sample Characteristic - Environment: city

Sample Characteristic - Location: Canada • Australia • Kingdom of Spain • Portuguese Republic • Baltic states • Austria • Poland • Czech Republic • Hungary • United Kingdom • contiguous United States of America • French Republic • Italy • Germany • Greece • Scandinavia • Benelux • Mexico • South America

Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12770162

## Background & Summary

Motorized transport currently accounts for more than 15% of world greenhouse gas emissions [1]. As most humans live in urban areas and two-thirds of the world population will live in cities by 2050 [2], mitigating car traffic in cities has become crucial for limiting climate change effects [3,4,5,6]. Daily commuting is the main driver of passenger car use: about 75% of American commuters drive every day (U.S. Department of Transportation, Bureau of Transportation Statistics, National Transportation Statistics. Table 1–41 at http://www.bts.gov (2016)), while alternative transport modes such as public transportation networks are unevenly developed among countries and cities (List of Metro Systems, Wikimedia Foundation https://en.wikipedia.org/wiki/List_of_metro_systems, 2020).

Over the last decades, various attempts to assess the environmental impact of car use in cities have emerged from multiple fields, ranging from econometric studies to physics or urban studies [7,8,9,10,11]. A seminal result of transport theory, by Newman and Kenworthy [10], correlated transport-related emissions with a decisive spatial criterion: urban density. Alternatively, Duranton and Turner [11] claimed that public transport services were unsuccessful in reducing traffic, as transit riders lured off the roads are replaced by new drivers on the freed-up roads. Such results, however, crucially lack both theoretical and empirical foundations [12,13,14,15], and new research [16] shows that the two main critical factors that control car traffic in cities are urban sprawl and access to mass rapid transit (MRT).

More generally, understanding mobility in urban areas is fundamental, not only for transport planning, but also for understanding many processes in cities, such as congestion or epidemic spread [17,18]. But what is a good measure of access to transit? Studies have mainly focused on the number of lines or stops [19,20,21], the length of the network, or graph analysis [22,23,24]. Few works [16,25,26], however, have investigated the catchment areas of MRT stations, i.e. the share of population living close to MRT stations, for instance within walking distance. Such conditions have nonetheless proved essential in explaining commuting behaviours and mobility patterns [16].

The most detailed definition of such catchment metrics is the People Near Transit (PNT), which originates from a 2016 publication of the Institute for Transportation and Development Policy (ITDP) [25]. That publication produces a rigorous dataset of the share of population living close to transit (less than 1 km) for 25 cities in the world (12 in OECD countries). However, the definitions of urban areas and rapid transit systems in that dataset are not uniform and need to be refined, while the number of cities must be expanded.

Hence, in order to expand our global knowledge of urban mobility, we need a common, unified and universal definition of access to public transit as well as sound measures of such a quantity. In this paper, we clarify its definition and propose what is to our knowledge the largest global dataset of PNT.

Our analysis uses functional urban areas (FUA) in OECD countries, a consistent definition of cities across several countries [27]. We restrict our measures to mass rapid transit, which usually refers to high-capacity heavy rail public transport, to which we added light rail and trams. In our sense, mass rapid transit thus encompasses:

• Tram, streetcar or light rail services.

• Subway, Metro or any underground service.

• Suburban rail services.

Buses are not included in that definition. In contrast with ref. [25], we do not exclude any form of commuter train based on station spacing or schedule criteria. As detailed in the Methods section, we identify services and corresponding stops with the General Transit Feed Specification (GTFS), a common format for public transportation schedules and associated geographic information (GTFS Static Overview. https://developers.google.com/transit/gtfs, 2020).

Combining open-access information from public transport agencies in OECD urban areas with population-grid estimates of world population [28], we publish here a list of 85 OECD cities (see Fig. 1) for which we were able to compute the People Near Transit (PNT) levels, defined as the share of urban population living at geometric distances of less than 500 m, 1,000 m and 1,500 m from any MRT station in the agglomeration:

$$\mathrm{PNT}(d)=\frac{\text{population s.t. Euclidean minimum distance} < d}{\text{total population}} \tag{1}$$

where $d$ = 500 m, 1,000 m or 1,500 m.
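As a worked illustration of Eq. (1), the following minimal sketch computes PNT(d) for a toy city, assuming each populated cell is represented by a single point in a metric projection. The function name `pnt` and the toy coordinates are illustrative assumptions and are not taken from the published pipeline, which intersects station buffers with population polygons.

```python
import math

def pnt(cells, stations, d):
    """Share of population whose nearest station is closer than d metres.

    cells: list of (x, y, population) tuples, coordinates in metres
    stations: list of (x, y) tuples in the same projection
    """
    total = sum(pop for _, _, pop in cells)
    if total == 0:
        return 0.0
    near = 0.0
    for x, y, pop in cells:
        # Euclidean minimum distance to any station (infinite if there is none)
        nearest = min(
            (math.hypot(x - sx, y - sy) for sx, sy in stations),
            default=math.inf,
        )
        if nearest < d:
            near += pop
    return near / total

# Toy data (hypothetical): five cells along a line, one station at the origin.
cells = [(0, 0, 100), (400, 0, 200), (900, 0, 300), (1400, 0, 400), (3000, 0, 1000)]
stations = [(0, 0)]

for d in (500, 1000, 1500):
    print(d, pnt(cells, stations, d))
```

With these toy values, 300, 600 and 1,000 of the 2,000 inhabitants fall within the three thresholds, so PNT grows with d as expected.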

Tables 1 and 2 display the 5 cities with the easiest access to MRT (largest PNT) and the 5 cities with the scarcest access to MRT (smallest PNT).

We also provide, for each city, population grid-maps with the corresponding MRT access level, i.e. grid-maps of MRT catchment areas at different distances with the population of each grid cell. As an example, Fig. 2 shows the 1000 m catchment area of MRT stations in Paris.

## Methods

### Residential populations for FUA

Our analysis relies on the 2015 residential population estimates mapped into the global Human Settlement Population (GHS-POP) project [28]. This spatial raster dataset depicts the distribution of population, expressed as the number of individuals per cell, on a grid of 250 m × 250 m cells. Residential population estimates for the target year 2015 are provided by CIESIN GPWv4.10 [29] and were disaggregated from census or administrative units to grid cells.

We downloaded the population tiles that cover land on the globe, in the Mollweide projection (EPSG:54009) and in raster format (.tif files). These raster data are made of pixels of width 250 m whose value is the number of people living in the cell. We processed the downloaded tiles with Python 3.7.6 (Python Language Reference, version 3.7.6 available at http://www.python.org) and the package gdal 3.0.2 (GDAL/OGR, Geospatial Data Abstraction software Library, Open Source Geospatial Foundation https://gdal.org) to convert the raster files into vectorized shapefiles. The resulting shapefiles consist of polygons whose field value is a population count. Since the polygonization process merges adjacent pixels with a common value into single polygons, the population of each polygon must be recomputed from polygon area and density through the following simple rule:

$$\mathrm{Pop}_{\mathrm{polygon}}=\mathrm{Pop}_{\mathrm{pixel}}\times \frac{\mathrm{Area}_{\mathrm{polygon}}}{\mathrm{Area}_{\mathrm{pixel}}} \tag{2}$$

where $\mathrm{Area}_{\mathrm{pixel}} = 250 \times 250 = 62{,}500\ \mathrm{m}^2$. This leaves us with a list of 224 population shapefiles that cover the land area of the Earth.
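The rescaling rule of Eq. (2) can be sketched in a few lines. This is only an illustration under the stated pixel size; the function and constant names are hypothetical and the input values are made up.

```python
# Area of one GHS-POP pixel at 250 m resolution, in square metres.
PIXEL_AREA_M2 = 250 * 250

def polygon_population(pop_per_pixel, polygon_area_m2, pixel_area_m2=PIXEL_AREA_M2):
    """Population of a polygon made of merged pixels sharing the same value (Eq. 2)."""
    return pop_per_pixel * polygon_area_m2 / pixel_area_m2

# Hypothetical polygon: three adjacent pixels of 12 inhabitants each were merged.
merged_area = 3 * PIXEL_AREA_M2
print(polygon_population(12.0, merged_area))
```

A polygon covering exactly one pixel keeps its original value, and a polygon clipped to a fraction of a pixel (for instance at an FUA boundary) receives the same fraction of the pixel population.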

By intersecting the resulting shapefiles with OECD shapefiles delineating Functional Urban Areas (FUA) in OECD countries [27] (reprojected into the Mollweide projection), we build a population-gridded dataset of cities in OECD countries.

These resulting files are the population substrates used for measuring population living close to MRT stations.

### Extracts of MRT stations from GTFS files

A common and de facto standard format for public transportation schedules and associated geographic information is the General Transit Feed Specification (GTFS Static Overview. https://developers.google.com/transit/gtfs, 2020).

A GTFS feed is a collection of at least six CSV files (with extension .txt) contained within a .zip file. It encompasses general information about transit agencies and routes in the network, schedule information such as trips and stop times, and geographic information for stops (geographic coordinates).
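Because a feed is simply a .zip archive of CSV tables, it can be opened with the Python standard library alone. The sketch below is one possible reader, not the code used to build the dataset; the helper name `read_gtfs_tables` and the tiny in-memory feed are assumptions made for illustration.

```python
import csv
import io
import zipfile

def read_gtfs_tables(feed, names=("routes.txt", "trips.txt", "stop_times.txt", "stops.txt")):
    """Return {table name: list of row dicts} for the requested tables of a GTFS feed.

    feed: path to a .zip file, or a binary file-like object holding one.
    """
    tables = {}
    with zipfile.ZipFile(feed) as zf:
        for name in names:
            with zf.open(name) as raw:
                # utf-8-sig drops a leading byte-order mark if the file has one
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables

# Build a tiny in-memory feed (hypothetical content) to exercise the reader.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("routes.txt", "route_id,route_type\nM1,1\n")
    zf.writestr("stops.txt", "stop_id,stop_lat,stop_lon\nS1,48.85,2.35\n")
buffer.seek(0)

tables = read_gtfs_tables(buffer, names=("routes.txt", "stops.txt"))
print(tables["routes.txt"][0]["route_type"])
```

Real feeds may place the tables in a subfolder of the archive or omit optional files, which this sketch does not handle.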

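The three main objects we require are:

• Routes: a route is a group of trips displayed to riders as a single service, and each route is assigned a type. The route types included in our definition of MRT are: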

- Tram, Streetcar, Light rail. Any light rail or street level system within a metropolitan area.

- Subway, Metro. Any underground rail system within a metropolitan area.

- Rail. Used for intercity or long-distance travel.

- Cable tram. Used for street-level rail cars where the cable runs beneath the vehicle, e.g., cable car in San Francisco.

Our definition of MRT excludes bus and ferry types:

- Bus. Used for short- and long-distance bus routes.

- Ferry. Used for short- and long-distance boat service.

• Trips: each trip is associated with a route and defines a particular scheduled journey between specific stations. For instance, the first train of the day is a trip.

• Stops: geographic locations of the stops, stations and their amenities within the transit system. Stops are organized into parent stations and their amenities (e.g. platforms or exits).

Joining in this order the four tables routes.txt, trips.txt, stop_times.txt and stops.txt allows us to bind stops with their associated route types. We can thus discriminate between bus stops and metro stops and thereby select objects according to our definition of MRT.

In a nutshell, each GTFS feed can be processed to produce localized and route-typed stops.
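The chain of joins described above (routes, then trips, then stop_times, then stops) can be sketched with plain dictionaries. The column names follow the GTFS specification, where `route_type` codes 0, 1, 2 and 5 denote tram/light rail, subway/metro, rail and cable tram, and codes 3 and 4 denote bus and ferry. The function name `mrt_stops` and the toy tables are assumptions for illustration, not the authors' code.

```python
import csv
import io

# GTFS route_type codes kept as MRT: tram/light rail, subway/metro, rail, cable tram.
MRT_ROUTE_TYPES = {"0", "1", "2", "5"}

def read_table(text):
    """Parse one GTFS table given as CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def mrt_stops(routes, trips, stop_times, stops):
    """Return (stop_id, lat, lon) for stops served by at least one MRT route."""
    route_type = {r["route_id"]: r["route_type"] for r in routes}
    trip_type = {t["trip_id"]: route_type[t["route_id"]] for t in trips}
    types_at_stop = {}
    for st in stop_times:
        types_at_stop.setdefault(st["stop_id"], set()).add(trip_type[st["trip_id"]])
    return [
        (s["stop_id"], float(s["stop_lat"]), float(s["stop_lon"]))
        for s in stops
        if types_at_stop.get(s["stop_id"], set()) & MRT_ROUTE_TYPES
    ]

# Toy feed (hypothetical): one metro route M1 and one bus route B7.
routes = read_table("route_id,route_type\nM1,1\nB7,3\n")
trips = read_table("route_id,service_id,trip_id\nM1,WK,T1\nB7,WK,T2\n")
stop_times = read_table(
    "trip_id,stop_id,stop_sequence\nT1,S1,1\nT1,S2,2\nT2,S2,1\nT2,S3,2\n"
)
stops = read_table(
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "S1,Alpha,48.85,2.35\nS2,Beta,48.86,2.36\nS3,Gamma,48.87,2.37\n"
)

selected = mrt_stops(routes, trips, stop_times, stops)
print([stop_id for stop_id, _, _ in selected])
```

In this toy feed, stop S2 is shared by the metro and the bus and is therefore kept, while S3 is served by the bus only and is dropped.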

### Measure of People Near Transit (PNT)

In order to measure PNT within urban areas, we must bind transit systems with their respective FUAs. We need to retrieve - and merge - all available GTFS files pertaining to a specific urban area and make sure that no rapid transit agency is excluded in the process.

Most GTFS files for cities in the world are collected by the OpenMobilityData platform (https://transitfeeds.com, 2020). For each city in our dataset, we cross-checked OpenMobilityData against Wikipedia's local network information (List of Metro Systems, https://en.wikipedia.org/wiki/List_of_metro_systems) to ensure that we considered all rapid transit agencies within the urban area.

For some European countries (Germany, France), GTFS files were not available on OpenMobilityData and had to be retrieved from other sources (GTFS für Deutschland https://gtfs.de/ and Open platform for French public data https://www.data.gouv.fr). We also note that the GTFS format is not common in South Korea, Japan or the United Kingdom, where we only found GTFS data for the Manchester area on OpenMobilityData (https://transitfeeds.com), while we directly used station coordinates for London (Transport for London. TFL Station Locations available at https://data.london.gov.uk/dataset/tfl-station-locations).

We were thus left with a list of 85 urban areas in the world for which we had complete, reliable and extensive data. From the route-typed stop coordinates within that dataset, we extract MRT stops (excluding buses and ferries) and buffer, still using gdal, catchment areas for several distance thresholds: 500 m, 1000 m and 1500 m. Intersecting the resulting buffers with the population-gridded shapefiles gives us the total population living within catchment areas, which can be expressed as a share of the total urban area population, resulting in the value of the PNT metric. Our results are shown in Online-only Table 1.
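The buffer-and-intersect step can be approximated without any GIS library. The sketch below is a swapped-in technique, not the gdal-based procedure used for the dataset: instead of computing exact polygon intersections, it samples each 250 m cell on a regular grid of points and allocates population in proportion to the covered area, in the spirit of Eq. (2). All names and toy values are hypothetical.

```python
import math

CELL = 250.0  # cell width in metres

def covered_fraction(x0, y0, stations, d, n=10):
    """Approximate share of the square cell with lower-left corner (x0, y0)
    lying within d metres of any station, using n x n sample points."""
    step = CELL / n
    hits = 0
    for i in range(n):
        for j in range(n):
            px = x0 + (i + 0.5) * step
            py = y0 + (j + 0.5) * step
            # `any` takes the union of buffers, so overlaps are not double-counted
            if any(math.hypot(px - sx, py - sy) < d for sx, sy in stations):
                hits += 1
    return hits / (n * n)

def pnt_area_weighted(cells, stations, thresholds=(500, 1000, 1500)):
    """PNT per threshold; cells is a list of (x0, y0, population) tuples."""
    total = sum(pop for _, _, pop in cells)
    result = {}
    for d in thresholds:
        near = sum(pop * covered_fraction(x0, y0, stations, d) for x0, y0, pop in cells)
        result[d] = near / total if total else 0.0
    return result

# Toy city (hypothetical): one cell around the station, one cell far away.
stations = [(125.0, 125.0)]
cells = [(0.0, 0.0, 100.0), (5000.0, 5000.0, 100.0)]
print(pnt_area_weighted(cells, stations))

# A cell straddling the 500 m buffer boundary is only partly covered.
print(covered_fraction(500.0, 0.0, stations, 500))
```

Increasing `n` refines the area estimate at the cost of more distance evaluations; an exact vector intersection, as performed with gdal, avoids this sampling error altogether.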

## Data Records

The Data Record of PNT in OECD urban areas is available online on Figshare [30].

PNT levels at distance thresholds of 500 m, 1,000 m and 1,500 m for the 85 Functional Urban Areas are shown in Online-only Table 1. The list of transit agencies for each city is available online along with PNT statistics (mrt_access.csv) [30].

We also provide, for each city, grid-maps of population at different distances from MRT (pops_close_to_MRT.zip) [30].

The Tables read as follows: the Basel urban area has 528,811 inhabitants, of which 57.78% live within 500 m of an MRT station, 80.15% within 1000 m and 86.96% within 1500 m.
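To make the reading of a table row concrete, the short sketch below converts the Basel shares quoted above into approximate headcounts. The helper name `headcounts` is an illustrative assumption; only the population and percentages come from the text.

```python
def headcounts(total_population, shares_percent):
    """Convert PNT shares (in percent) into rounded numbers of inhabitants."""
    return {
        d: round(total_population * share / 100)
        for d, share in shares_percent.items()
    }

# Basel figures quoted in the text: distance threshold (m) -> PNT (%)
basel = headcounts(528811, {500: 57.78, 1000: 80.15, 1500: 86.96})
print(basel)

# Inhabitants living between 500 m and 1000 m from the nearest MRT station
print(basel[1000] - basel[500])
```

Because the thresholds are nested, the difference between two consecutive headcounts gives the population of the ring between the two distances.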

## Technical Validation

The most thorough and exhaustive measure of PNT in urban areas in the existing literature is a 2016 report from the Institute for Transportation and Development Policy [25]. To validate our results and our methodology, we compared them with those results.

Out of the 12 OECD cities considered in ref. [25], 11 are in our dataset: 5 in the United States, 2 in Spain, 1 in Canada, 1 in France, 1 in the United Kingdom and 1 in the Netherlands (see Table 3). Unfortunately, we found no data for the remaining city: Seoul.

Out of these 11 cities, we had at first glance similar results for only two cities: Chicago (13% for both) and Vancouver (19% vs 23%). The discrepancies observed for the other cases stem from different definitions of cities and from the different transit systems that were taken into account. While we work with Functional Urban Areas (FUA) only, the authors of ref. [25] mix two different definitions of cities: FUA and urban cores. By applying our method to urban cores rather than functional urban areas, we found the same or similar results for Barcelona, Madrid, Rotterdam and Washington (see Table 3).

Also, the authors of ref. [25] used a definition of LRT (Light Rail Transit) and Suburban Rail that depends on station spacing and schedule criteria. We did not adopt this definition; for Boston and New York, we therefore had to exclude suburban trains, while keeping the FUA definition, in order to retrieve results similar to those of Table 3. In contrast, the study [25] took into account Bus Rapid Transit for Los Angeles, which we decided to exclude. Finally, in Paris the authors of ref. [25] counted the so-called RER trains as Suburban Rail, but not Transilien trains, while we included both systems in our analysis.

The conclusion here is that for similar definitions of cities and transit systems, we obtain similar results, validating our method and calculations. In order to facilitate comparison across future studies, we recommend using the definition of cities given by Functional Urban Areas, since it is very commonly used and already unified for OECD countries. Concerning transit systems, we think that it is more relevant and also more verifiable to classify transit systems by type (Rail versus Road) rather than by spacing and schedule criteria that are specious and less universal. Hence, having compared our results with those of the ITDP report [25] and checked in Table 3 that our methodology is correct, we decided to keep our unmodified estimates for the considered cities, despite the discrepancies with ref. [25].

For the other cities in the dataset we have unfortunately found no existing data to compare with. We therefore hope that future research will test and expand our estimates and results.

## Usage Notes

Simple code and hints are available on GitLab (https://gitlab.iscpif.fr/vverbavatz/mrt-access-project).

We strongly recommend using GDAL (GDAL/OGR, Geospatial Data Abstraction software Library, Open Source Geospatial Foundation https://gdal.org, 2020) to handle geographic data with Python.

## Code availability

Detailed code generating the database can be accessed from the source code hosted via Gitlab (https://gitlab.iscpif.fr/vverbavatz/mrt-access-project).

## References

1. Herzog, T. World greenhouse gas emissions in 2005. World Resources Institute (2009).

2. United Nations, Department of Economic and Social Affairs, Population Division. World Urbanization Prospects: The 2014 Revision. Highlights ST/ESA/SER.A/352 (2014).

3. Dodman, D. Blaming cities for climate change? An analysis of urban greenhouse gas emissions inventories. Environ. Urban. 21, 185–201 (2009).

4. Glaeser, E. L. & Kahn, M. E. The greenness of cities: Carbon dioxide emissions and urban development. J. Urban Econ. 67, 404–418 (2010).

5. Oliveira, E. A., Andrade, J. S. Jr. & Makse, H. A. Large cities are less green. Sci. Rep. 4, 13–21 (2014).

6. Newman, P. G. The environmental impact of cities. Environ. Urban. 18, 275–295 (2006).

7. Creutzig, F. et al. Global typology of urban energy use and potentials for an urbanization mitigation wedge. Proceedings of the National Academy of Sciences 112.20, 6283–6288 (2015).

8. Pumain, D. Scaling laws and urban systems (Santa Fe Institute, 2004).

9. Barthelemy, M. The structure and dynamics of cities. Cambridge University Press (2016).

10. Newman, P. G. & Kenworthy, J. R. Cities and automobile dependence: An international sourcebook (Gower Publishing, 1989).

11. Duranton, G. & Turner, M. A. The fundamental law of road congestion: Evidence from US cities. Am. Econ. Rev. 101, 2616–52 (2011).

12. Buchanan, M. The benefits of public transport. Nat. Phys. 15, 876 (2019).

13. Anderson, M. L. Subways, strikes, and slowdowns: The impacts of public transit on traffic congestion. Am. Econ. Rev. 104, 2763–96 (2014).

14. Litman, T. Evaluating rail transit benefits: A comment. Transp. Policy 14, 94–97 (2007).

15. Baum-Snow, N., Kahn, M. E. & Voith, R. Effects of urban rail transit expansions: Evidence from sixteen cities, 1970–2000. Brookings-Wharton papers on urban affairs, 147–206 (2005).

16. Verbavatz, V. & Barthelemy, M. Critical factors for mitigating car traffic in cities. Plos One 14, e0219559 (2019).

17. Dalziel, B. D., Pourbohloul, B. & Ellner, S. P. Human mobility patterns predict divergent epidemic dynamics among cities. Proceedings of the Royal Society B: Biological Sciences 280, 20130763 (2013).

18. Balcan, D. et al. Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences 106, 21484–21489 (2009).

19. Fouracre, P., Dunkerley, C. & Gardner, G. Mass rapid transit systems for cities in the developing world. Transp. Rev. 23, 299–310 (2003).

20. Gallotti, R. & Barthelemy, M. Anatomy and efficiency of urban multimodal mobility. Sci. Rep. 4, 1–9 (2014).

21. Gallotti, R. & Barthelemy, M. The multilayer temporal network of public transport in Great Britain. Sci. Data 2, 1–8 (2015).

22. Musso, A. & Vuchic, V. R. Characteristics of metro networks and methodology for their evaluation. National Research Council, Transportation Research Board (1988).

23. Gattuso, D. & Miriello, E. Compared analysis of metro networks supported by graph theory. Netw. Spat. Econ. 5, 395–414 (2005).

24. Derrible, S. & Kennedy, C. The complexity and robustness of metro networks. Physica A 389, 3678–3691 (2010).

25. Marks, M., Mason, J. & Oliveira, G. People near transit: Improving accessibility and rapid transit coverage in large cities, Institute for Transportation and Development Policy (2016).

26. Singer, G. & Burda, C. Fast Cities: A comparison of rapid transit in major Canadian cities (Pembina Institute, 2014).

27. Dijkstra, L., Poelman, H. & Veneri, P. The EU-OECD definition of a functional urban area. OECD Regional Development Working Papers, 2019/11, Éditions OCDE, Paris (2019).

28. Florczyk, A. J. et al. GHSL Data Package 2019. ISBN 978-92-76-13186-1 (Publications Office of the European Union, 2019).

29. Center for International Earth Science Information Network (CIESIN), Columbia University. Gridded population of the world, version 4 (GPWv4): population density (2016).

30. Verbavatz, V. & Barthelemy, M. People Near Transit (PNT). figshare https://doi.org/10.6084/m9.figshare.12013020.v4 (2020).

## Acknowledgements

V.V. thanks the École nationale des ponts et chaussées for their financial support. This material is based upon work supported by the Complex Systems Institute of Paris Île-de-France (ISC-PIF).

## Author information


### Contributions

V.V. and M.B. designed the study, V.V. acquired the data, V.V. analyzed and interpreted the data, and V.V. and M.B. wrote the manuscript.

### Corresponding author

Correspondence to Marc Barthelemy.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.


Verbavatz, V., Barthelemy, M. Access to mass rapid transit in OECD urban areas. Sci Data 7, 301 (2020). https://doi.org/10.1038/s41597-020-00639-3