Access to mass rapid transit in OECD urban areas

As mitigating car traffic in cities has become paramount to abate climate change effects, fostering public transport in cities appears ever more appealing. A key ingredient for that purpose is easy access to mass rapid transit (MRT) systems. So far, however, we have few empirical estimates of the coverage of MRT in urban areas, computed as the share of people living in MRT catchment areas, for instance within walking distance. In this work, we clarify a universal definition of such a metric, People Near Transit (PNT), and present measures of this quantity for 85 urban areas in OECD countries, the largest dataset of such a quantity so far. By suggesting a standardized protocol, we make our dataset sound and expandable to other countries and cities in the world, which gives our work a solid basis for multiple reuses in transport, environmental or economic studies.


Background & Summary
Motorized transport currently accounts for more than 15% of world greenhouse gas emissions 1. As most humans live in urban areas and two-thirds of world population will live in cities by 2050 2, mitigating car traffic in cities has become crucial for limiting climate change effects [3][4][5][6]. Daily commuting is the main driver for passenger car use (about 75% of American commuters drive daily 7), while alternative transport modes such as public transportation networks are unevenly developed among countries and cities 8.
Over the last decades, various attempts to assess the environmental impact of car use in cities have emerged from multiple fields, ranging from econometric studies to physics or urban studies 13,[9][10][11][12]. A seminal result of transport theory, by Newman and Kenworthy 12, correlated transport-related emissions with a determinant spatial criterion: urban density. Alternatively, Duranton and Turner 13 claimed that public transport services were unsuccessful in reducing traffic, as transit riders lured off the roads are replaced by new drivers on the released roads. Such results, however, crucially lack both theoretical and empirical foundations 14,16,15,17, and new research 18 shows that the two main critical factors that control car traffic in cities are urban sprawl and access to mass rapid transit (MRT).
More generally, understanding mobility in urban areas is fundamental, not only for transport planning, but also for understanding many processes in cities, such as congestion problems or epidemic spread 19,20. But what is a good measure of access to transit? Studies have mainly focused on the number of lines or stops [21][22][23], length of the network or graph analysis [24][25][26]. Few works 27,28,18, however, have considered investigating catchment areas of MRT stations, i.e. looking at the share of population living close to MRT stations, for instance within walking distance. Such conditions have however proved to be essential in explaining commuting behaviours and mobility patterns 18.
The most detailed definition of such catchment metrics is the People Near Transit (PNT), and originates from a 2016 publication from the Institute for Transportation and Development Policy (ITDP) 27. It produces a rigorous dataset of the share of population living close to transit (less than 1 km) for 25 cities in the world (12 in OECD countries). However, definitions of urban areas and rapid transit systems in that dataset are multiple and need to be refined, while the number of cities must be expanded.
Hence, in order to expand our global knowledge of urban mobility, we need a common, unified and universal definition of access to public transit as well as sound measures of such a quantity. In this paper, we clarify its definition and propose what is to our knowledge the largest dataset of PNT globally.
Our analysis uses functional urban areas (FUA) in OECD countries, a consistent definition of cities across several countries 29. We restrict our measures to mass rapid transit, usually referring to high-capacity heavy rail public transport, to which we added light rails and trams. In our sense, mass rapid transit thus encompasses:
• Tram, streetcar or light rail services.
• Subway, Metro or any underground service.
Buses are not included in that definition. In contrast with 27, we do not exclude any form of commuting trains based on station spacing or schedule criteria. As detailed in the Method section, we identify services and corresponding stops with the General Transit Feed Specification (GTFS), a common format for public transportation schedules and associated geographic information 30.
Crossing open-access information from public transport agencies in OECD urban areas with population-grid estimates of world population 31, we publish here a list of 85 OECD cities (see Fig. 1) for which we were able to compute the People Near Transit (PNT) levels, defined as the share of urban population living within geometric distances of 500 m, 1,000 m and 1,500 m of any MRT station in the agglomeration:
$$\mathrm{PNT}(d) = \frac{\text{population living within distance } d \text{ of an MRT station}}{\text{total population of the urban area}}$$
where d = 500, 1000, 1500 (in metres).
We display in Tables 1 and 2 the 5 cities with easiest access to MRT (largest PNT) and the 5 cities with scarcest access to MRT (smallest PNT).
We also provide for each city the population grid-maps with corresponding MRT access level, i.e. grid-maps of MRT catchment areas at different distances with population in each grid cell. As an example, Fig. 2 shows the 1000 m catchment area of MRT stations in Paris.

Residential populations for FUA
Our analysis relies on the 2015 residential population estimates mapped in the Global Human Settlement Population (GHS-POP) project 31. This spatial raster dataset depicts the distribution of population expressed as the number of individuals per cell on a grid of 250 m × 250 m cells. Residential population estimates for the target year 2015 are provided by CIESIN GPWv4.10 39 and were disaggregated from census or administrative units to grid cells.
We downloaded population tiles that cover land on the globe in the Mollweide projection (EPSG:54009) and in raster format (.tif files). These raster data are made of pixels of width 250 m whose associated value is the number of people living in the cell. We processed the downloaded tiles with Python 3.7.6 37 and package gdal 3.0.2 38 to convert the raster files into vectorized shapefiles. The resulting shapefiles consist of polygons whose field value is the population per pixel. Since the polygonization process merges adjacent pixels with common value into single polygons, the population of each polygon must be recomputed from polygon area and density through the following simple rule:
$$\mathrm{Population}_{\text{polygon}} = \mathrm{Value}_{\text{pixel}} \times \frac{\mathrm{Area}_{\text{polygon}}}{\mathrm{Area}_{\text{pixel}}}$$
where $\mathrm{Area}_{\text{pixel}} = 250 \times 250 = 62\,500$ m². This leaves us with a list of 224 shapefiles of population that cover land area on earth.
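The rescaling rule can be illustrated with a minimal Python sketch. The function name and its arguments are hypothetical and not taken from the authors' code; the sketch assumes that each polygon carries the per-pixel population value shared by the pixels it merged, and that polygon areas are measured in m² in the Mollweide projection.

```python
PIXEL_AREA_M2 = 250 * 250  # 62 500 m², area of one GHS-POP cell


def polygon_population(pixel_value: float, polygon_area_m2: float) -> float:
    """Population of a polygon built from merged pixels sharing one value.

    pixel_value: number of people per 250 m pixel (the polygon's field value)
    polygon_area_m2: area of the merged polygon, in square metres
    """
    n_pixels = polygon_area_m2 / PIXEL_AREA_M2  # how many pixels were merged
    return pixel_value * n_pixels


# A polygon merging 4 adjacent pixels that each hold 12.5 people
print(polygon_population(12.5, 4 * PIXEL_AREA_M2))  # 50.0
```

Under this assumption, a polygon covering a single pixel keeps its original value, while larger polygons scale linearly with area.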
By intersecting the resulting shapefiles with OECD shapefiles delineating Functional Urban Areas (FUA) in OECD countries 29 (reprojected into Mollweide projection), we can build a population-gridded dataset of cities in OECD countries.
These resulting files are the population substrates used for measuring population living close to MRT stations.

Extracts of MRT stations from GTFS files
A common and de facto standard format for public transportation schedules and associated geographic information is the General Transit Feed Specification (GTFS) 30 .
A GTFS feed is a collection of at least six CSV files (with extension .txt) contained within a .zip file. It encompasses general information about transit agencies and routes in the network, schedule information such as trips and stop times, and geographic information for stops (geographic coordinates).
The three main objects we require are:
• Routes: distinct routes in the network of a certain type. A route is a (one-direction) regular line, for instance a metro or bus line. The route types we use are 30:
  - Tram, Streetcar, Light rail. Any light rail or street level system within a metropolitan area.
  - Subway, Metro. Any underground rail system within a metropolitan area.
  - Rail. Used for intercity or long-distance travel.
  - Cable tram. Used for street-level rail cars where the cable runs beneath the vehicle, e.g., cable car in San Francisco.
  Our definition of MRT excludes bus and ferry types:
  - Bus. Used for short- and long-distance bus routes.
  - Ferry. Used for short- and long-distance boat service.
• Trips: trips are associated with a route and define a particular scheduled trip between specific stations. For instance, the first train of the day is a trip.
• Stops: stops are geographic locations of the stops, stations and their amenities within the transit system. Stops are organized into a parent station and their amenities (e.g. platforms or exits).
Joining in this order the four tables routes.txt, trips.txt, stop_times.txt and stops.txt lets us bind stops with their associated route types. We can thus discriminate between bus stops and metro stops and thereby restrict the stops to our definition of MRT.
In a nutshell, each GTFS file can be processed to produce localized and route-typed stops.
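As an illustration of this join, the following Python sketch chains the four tables on a tiny in-memory feed. It is a simplified sketch, not the authors' pipeline: the helper names and the toy feed are hypothetical, and the set of retained `route_type` codes (0 tram or light rail, 1 subway or metro, 2 rail, 5 cable tram in the GTFS reference) is an assumption mirroring the route types listed in this section.

```python
import csv
import io

# GTFS route_type codes kept as MRT in this sketch (assumption):
# 0 = tram/light rail, 1 = subway/metro, 2 = rail, 5 = cable tram.
# Bus (3) and ferry (4) are left out.
MRT_ROUTE_TYPES = {"0", "1", "2", "5"}


def read_table(text):
    """Parse the content of one GTFS .txt file (CSV) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def mrt_stops(routes, trips, stop_times, stops):
    """Join routes -> trips -> stop_times -> stops and keep MRT stops only."""
    mrt_routes = {r["route_id"] for r in routes
                  if r["route_type"] in MRT_ROUTE_TYPES}
    mrt_trips = {t["trip_id"] for t in trips if t["route_id"] in mrt_routes}
    mrt_stop_ids = {s["stop_id"] for s in stop_times
                    if s["trip_id"] in mrt_trips}
    return {s["stop_id"]: (float(s["stop_lat"]), float(s["stop_lon"]))
            for s in stops if s["stop_id"] in mrt_stop_ids}


# Toy feed: one metro route (M1) and one bus route (B7)
routes = read_table("route_id,route_type\nM1,1\nB7,3\n")
trips = read_table("trip_id,route_id\nt1,M1\nt2,B7\n")
stop_times = read_table("trip_id,stop_id\nt1,s1\nt1,s2\nt2,s2\nt2,s3\n")
stops = read_table(
    "stop_id,stop_lat,stop_lon\n"
    "s1,48.85,2.35\ns2,48.86,2.36\ns3,48.87,2.37\n"
)

result = mrt_stops(routes, trips, stop_times, stops)
print(sorted(result))  # ['s1', 's2']
```

A stop served by both a metro and a bus route (s2 here) is kept, since at least one MRT route calls there, while a bus-only stop (s3) is dropped.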

Measure of People Near Transit (PNT)
In order to measure PNT within urban areas, we must bind transit systems with their respective FUAs. We need to retrieve, and merge, all available GTFS files pertaining to a specific urban area and make sure that no rapid transit agency is excluded in the process.
Most GTFS files for cities in the world are collected by the OpenMobilityData platform 32. For each city in our dataset, we cross-checked OpenMobilityData with Wikipedia local network information 8 to ensure that we considered all agencies of rapid transit within the urban area.
For some European countries (Germany, France), GTFS files were not available on OpenMobilityData and have been retrieved from other sources 33,34. We also note that the GTFS format is not common in South Korea, Japan and the United Kingdom, where we only found GTFS data for the Manchester area on OpenMobilityData 32, while we directly used station coordinates for London 40.
We were thus left with a list of 85 urban areas in the world for which we had complete, reliable and extensive data. From route-typed stop coordinates within that dataset, we can extract MRT stops (excluding buses and ferries) and buffer (still using gdal) catchment areas for several distance thresholds: 500 m, 1 000 m and 1 500 m. Intersecting the resulting buffers with the population-gridded shapefiles gives us the total population living within catchment areas, which can be expressed as a share of the total urban area population, resulting in the value of the PNT metric. Our results are shown in Online-only Table 3.
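The catchment computation can be sketched in pure Python under a simplifying assumption: instead of intersecting buffer polygons with population polygons as described above, each population cell is assigned to the catchment area when its centroid lies within distance d of a station. The function name, the data layout and the toy values are hypothetical; coordinates are assumed to be in a projected system with metre units.

```python
import math


def pnt(cells, stations, d):
    """Share of population whose cell centroid is within d metres of a station.

    cells: list of (x, y, population), centroids in projected metres
    stations: list of (x, y) MRT station coordinates in the same system
    d: catchment distance threshold in metres
    """
    total = sum(pop for _, _, pop in cells)
    near = sum(
        pop for x, y, pop in cells
        if any(math.hypot(x - sx, y - sy) <= d for sx, sy in stations)
    )
    return near / total if total else 0.0


# Toy urban area: four cells along a line, one station at the origin
cells = [(0, 0, 100), (400, 0, 50), (900, 0, 30), (2000, 0, 20)]
stations = [(0, 0)]

for d in (500, 1000, 1500):
    print(d, pnt(cells, stations, d))
```

With these toy values, 150 of 200 inhabitants fall within 500 m and 180 of 200 within both 1 000 m and 1 500 m. A true polygon intersection would additionally split cells that straddle the buffer boundary, which this centroid test does not do.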

Data Records
The Data Record of PNT in OECD urban areas is available online on Figshare 36.
PNT levels at distance thresholds of 500 m, 1 000 m and 1 500 m for the 85 Functional Urban Areas are shown in Online-only Table 3. The list of transit agencies for each city is online along with PNT statistics (mrt_access.csv) 36.
We also provide, for each city, grid-maps of population at different distances from MRT (pops_close_to_MRT.zip) 36.
The Tables read as follows: Basel urban area has 528811 inhabitants, of which 57.78% live within 500 m of an MRT station, 80.15% within 1 000 m and 86.96% within 1 500 m.
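To make this reading concrete, the short Python sketch below converts the Basel shares quoted above back into approximate head counts. The helper name is hypothetical, and the rounded figures are only indicative since the published shares are themselves rounded to two decimals.

```python
def people_within(population: int, share_percent: float) -> int:
    """Approximate number of inhabitants corresponding to a PNT share."""
    return round(population * share_percent / 100)


basel_population = 528811
basel_pnt = {500: 57.78, 1000: 80.15, 1500: 86.96}  # distance (m) -> PNT (%)

for distance, share in basel_pnt.items():
    print(f"{distance} m: about {people_within(basel_population, share)} inhabitants")
```

This gives roughly 305 500 inhabitants within 500 m, 423 800 within 1 000 m and 459 900 within 1 500 m of an MRT station.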

Technical Validation
The most thorough and exhaustive measure of PNT in urban areas in existing literature is a 2016 report from the Institute for Transportation and Development Policy 27. To validate our results and our methodology, we compared them with those results.
Out of the 12 OECD cities considered in 27, 11 are in our dataset: 5 in the United States, 2 in Spain, 1 in Canada, 1 in France, 1 in the United Kingdom and 1 in the Netherlands (see Table 4). Unfortunately, we found no data for the remaining city: Seoul.
Out of these 11 cities, we had at first glance similar results for only two cities: Chicago (13% for both) and Vancouver (19% vs 23%). The discrepancies observed for the other cases stem from different definitions of cities and from the different transit systems that were taken into account. While we work with Functional Urban Areas (FUA) only, the authors of 27 mix two different definitions of cities: FUA and urban cores. By applying our method to urban cores and not functional urban areas, we found the same or similar results for Barcelona, Madrid, Rotterdam and Washington (see Table 4). Also, the authors of 27 considered a definition of LRT (Light Rail Transit) and Suburban Rail that depends on station spacing and schedule criteria. We did not choose this definition, and for Boston and New York we therefore had to exclude suburban trains, while keeping the definition of FUA, in order to retrieve results similar to those of Table 4. In contrast, the study 27 took into account Bus Rapid Transit for Los Angeles, which we decided to exclude.
Finally, in Paris the authors of 27 considered that the so-called RER trains were included in Suburban Rail, but not Transilien trains, while we included both systems in our analysis.
The conclusion here is that for similar definitions of cities and transit systems, we obtain similar results, validating our method and calculations. To facilitate comparison across future studies, we would recommend using the definition of cities given by Functional Urban Areas since it is very commonly used and already unified for OECD countries. Concerning transit systems, we think that it is more relevant and also verifiable to consider transit systems based on their types (Rail versus Road) rather than on spacing and schedule criteria that are specious and less universal. Hence, in comparing our results with results from the ITDP report 27 and after checking in Table 4 that our methodology is correct, we decided to keep our unmodified estimations for the considered cities, despite the discrepancies with 27.
For other cities in the dataset we have unfortunately found no existing data to compare with.Thus, we hope for future research to test and expand our estimations and results.

Usage Notes
Easy code and hints are given on GitLab 35. We strongly recommend using GDAL 38 to handle geographic data with Python.

Figure 1: The 85 OECD cities for which we found data are mostly found in Europe and in North America 41.

Figure 2: 1000 m catchment areas of MRT stations (in orange) in Paris functional urban area (boundaries are in black) 41.

Table 1: Population Near Transit values: share of population living within the catchment area of an MRT station at thresholds 500 m, 1 000 m and 1 500 m. Top 5 cities with easiest (1000 m) access to MRT 41.

Table 4: Comparison of MRT share from the ITDP report 27 with our estimations for 11 OECD cities. Discrepancies at first glance can be explained by different delineations of cities or transit systems. Applied to the same entities, results are similar.