Background & Summary

Cities for all, spatial accessibility, and travel times

Around the globe, cities are increasing their efforts to prioritise their residents’ well-being. Socially just and equitable access to the services that allow people to live their everyday lives and participate in society1,2 is one of the key factors that contribute to urban ‘livability’3 and to the opportunity of all city residents to ‘live a good life’4. In this context, spatial accessibility5,6 has become a core indicator in planners’ and decision-makers’ toolboxes to gauge and direct urban development. Building on a solid body of research and established concepts such as time geography7, accessibility has been used widely as a proxy for the quality of life a city provides to its general population8,9 and particularly to marginalised groups such as low-income families10 or older people11. The fundamental idea, that all residents should have equal access to a city’s societal and everyday functions, is also strongly present in current planning discussions, including the debate around the concept of the 15-minute city.

Most commonly, accessibility is measured through time. Travel times between residential locations, everyday services such as grocery stores or libraries, and places of work or school are indispensable information for practitioners in urban planning, in transport planning, and in service network planning. While travel times do not translate directly into accessibility, they have become a core quantitative measure of potential mobility in cities. Recent advances in the efficiency of computational tools and in the availability of the required data have allowed researchers and practitioners to develop more sensitive travel time estimation models that account for differences between travel modes, differences between people, and differences between points in time.

Multi-modal travel time matrices

Travel time matrices are pre-computed data sets that contain the travel times between all points in a regular grid over a study area12. They thus reduce the need for on-the-fly computation, and can provide an instant overview of the accessibility landscapes of entire city regions.

Here, we present a travel time matrix for the metropolitan region of Helsinki, Finland, that accounts for multiple travel modes, at different times of the day, and considers interpersonal differences in terms of physical capabilities. The metropolitan region covers the area of four municipalities, Helsinki, Espoo, Vantaa, and Kauniainen, with a total population of around 1.2 million residents13. The built environment varies across the region: Helsinki’s city centre has a population density of up to 40,000 people/km² that quickly declines towards the outskirts13, which, just like Espoo, Kauniainen and Vantaa, are dominated by single-family housing and commuter towns built in the 1950s to 1970s14. Public transport is organised jointly in this region and comprises a metro line, several commuter train lines and a dense network of feeder buses, trunk buses and tramways. There is a dense and highly diversified road network, including motorways, as well as a network of high-quality cycling and walking paths that cover the region.

We recorded travel times between the centroids of each of the 13,132 grid cells that the 250 × 250 meter grid database of the Finnish Environment Institute Syke and Statistics Finland15 defines for the region. The new data set carries on the ideas and concepts presented by Salonen and Toivonen in 201316 and Tenkanen and Toivonen in 201812: the scope has widened, the entire routing engine and methodology have been renewed, and the modelling is based on new, more realistic measurements of walking, cycling, and driving speeds, while ensuring backwards compatibility for comparability.

We computed travel times for four different travel modes for the 13,132 × 13,132 origin-destination pairs with varying parameters, amounting to a total of 2.2 billion rows of result data (i.e., trips). The travel modes, described in detail below, comprise walking, cycling, public transport, and car. Car and public transport travel times are computed with rush hour, midday, and night-time driving speeds and transit schedules, and the models use two different walking and three different cycling speeds. This is to increase representation beyond ‘the average person’12,17, and is based on an understanding of mobility that we developed together with older people in co-creation workshops held during the URBANAGE project18. Models of slower walking speeds further acknowledge seasonal patterns present in cities with pronounced snow cover and slippery winter conditions19.

Except for car speed correction coefficients and some of the cycling speeds, all input data are either open data sets or have been derived from open data. They are available in identical or similar format for other cities, to ensure easy replicability and transferability. The Python package used, r5py20, and the underlying Java library, R521,22,23, are released under liberal open-source licenses. Our tool-chain is available in a source code repository, and as a Docker container (see Code availability, below). As such, our methodology is readily transferable and can be easily reproduced and applied to other cities, metropolitan areas, and more rural regions, elsewhere. We have tested replicability within the framework of the URBANAGE project (https://urbanage.eu) by computing travel time matrices for the cities of Ghent, Belgium, and Santander, Spain.

Our guiding principles have been to produce comparable, relatively simple, reproducible, and transferable methods and data sets for analysing spatial accessibility that are sensitive to temporal and interpersonal variation and serve the needs of both researchers and practitioners.

The resulting data set is available for download in a diverse range of formats and has been in active use by several administrative departments of the City of Helsinki, as well as at the Finnish Environment Institute Syke, among other stakeholders in Finland.

Methods (Travel time computation)

Input data

For computing the travel time matrices, we used a set of openly available data sources.

The street network, used in the computation of the walking, cycling, and car travel times, as well as for access, egress, and transfer to, from, and between public transport legs, is based on OpenStreetMap (OSM) data (https://planet.osm.org/). OSM is the primary source of GIS data the Helsinki Region Transport Authority (HSL) uses24, and is actively maintained and updated by the City of Helsinki. OSM’s road network representation is considered on par with commercial products. To model the network at a precise date and time, we created a workflow that uses osmium25 to extract the OSM network at a defined point in time from a full history database.

For the public transport schedule, we used the openly available data set that HSL provides in General Transit Feed Specification (GTFS) format, describing routes, stops, and schedules of the regional public transport network. The routing engine used (see below) natively supports this standardised file format; transport authorities and public transport operators in many cities and regions around the world use this format. Similarly to the street network, we devised a workflow that finds and downloads the precise GTFS data set valid on the date chosen for analysis from Transitland (https://transit.land/), an online repository providing a history of GTFS schedule data sets for many operators worldwide.

As a reference grid, and as origin and destination locations for the travel time models, we used the 250 × 250-meter grid from the Monitoring System of Spatial Structure and Urban Form (Yhdyskuntarakenteen seurantajärjestelmä, YKR) established by the Finnish Environment Institute Syke and Statistics Finland (https://stat.fi/org/avoindata/paikkatietoaineistot.html#tilastoruudukko-250m). This data set, commonly known as the YKR grid, is available for the entire country, and is compatible with various statistical data provided by Statistics Finland. It thus enables the combination of the travel time results with other sociodemographic and environmental data16,17. We snapped the centroids of all grid cells to the walkable network, with a maximum snapping distance of one grid cell diagonal (250 m × √2 ≈ 353.5 m). To account for the snapping, the time needed to walk the straight-line snapping distance was added to the overall trip travel times. Unsnappable centroids, i.e., those further than 353.5 m from the walkable road network, were discarded from the computation.
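The snapping rule can be illustrated with a short calculation. The following is a minimal sketch, not the implementation used to produce the data set: the function name `snapping_penalty_minutes` is hypothetical, the 353.5 m cut-off is taken to be the length of one cell diagonal (250 m × √2 ≈ 353.55 m), and the walking speed is assumed to be the ‘average’ 5.15 km/h reported in the section on walking.

```python
import math

CELL_SIZE_M = 250.0         # edge length of a YKR grid cell
WALKING_SPEED_KMH = 5.15    # assumption: the 'average' walking speed

# Diagonal of a square grid cell, approximately the 353.5 m cut-off
CELL_DIAGONAL_M = CELL_SIZE_M * math.sqrt(2)


def snapping_penalty_minutes(snap_distance_m, max_distance_m=CELL_DIAGONAL_M):
    """Walking minutes for the straight-line snap distance, or None if unsnappable."""
    if snap_distance_m > max_distance_m:
        return None  # such a centroid would be discarded from the computation
    speed_m_per_min = WALKING_SPEED_KMH * 1000 / 60
    return snap_distance_m / speed_m_per_min


print(round(CELL_DIAGONAL_M, 2))
print(snapping_penalty_minutes(100.0))
print(snapping_penalty_minutes(400.0))
```

The penalty returned here would be added at both the origin and the destination end of a trip.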

To better model the local variation in cycling speeds, we obtained two data sets that we consider representative of different cyclists in the city of Helsinki: (1) an open data set (https://www.hsl.fi/en/hsl/open-data#journeys-made-by-city-bikes) of the local bicycle sharing system (BSS), and (2) data from Strava Metro (https://metro.strava.com/), a fitness application. The BSS data contains details on 3.1 million trips made between April and October 2019, including average speeds. Strava data was downloaded from the Strava Metro dashboard, where we selected monthly data for September 2022 for all ride activities (commute and leisure). We then summarised the data per OSM edge to obtain average speeds cycled. Strava Metro requires registration and approval, but is free of charge for research that benefits regional decision making.

Similarly, to account for the actually driven speed of cars, we obtained a sample of floating car data that recorded road links and the average speeds driven on them at different times of the day (TomTom TrafficStats, https://www.tomtom.com/products/traffic-stats/). We computed coefficients between prescribed speed limits and driven speeds per urban form type that then could be applied to all street segments in the study area. We followed the approach presented by Perola26. This is the only data set used in producing the travel time matrices that is not openly or freely available for research purposes.
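As an illustration of the idea of speed coefficients, the sketch below averages the ratio of driven speed to speed limit per class of street. It is a simplified stand-in under stated assumptions, not the procedure of Perola26: the observations, the urban form labels, and the function name `speed_coefficients` are hypothetical, classes are keyed by urban form and speed limit only, and the time of day is left out for brevity.

```python
from collections import defaultdict

# Hypothetical floating car observations:
# (urban form type, speed limit in km/h, average driven speed in km/h)
observations = [
    ("inner_city", 30, 21.0),
    ("inner_city", 30, 24.0),
    ("inner_city", 50, 30.0),
    ("suburban", 50, 45.0),
    ("suburban", 50, 40.0),
]


def speed_coefficients(rows):
    """Mean ratio of driven speed to speed limit per (urban form, speed limit) class."""
    ratios = defaultdict(list)
    for urban_form, speed_limit, driven_speed in rows:
        ratios[(urban_form, speed_limit)].append(driven_speed / speed_limit)
    return {key: sum(values) / len(values) for key, values in ratios.items()}


coefficients = speed_coefficients(observations)

# Apply a coefficient to a street segment that has no floating car data of its own
segment_speed_limit = 50
effective_speed = segment_speed_limit * coefficients[("suburban", 50)]
print(coefficients)
print(effective_speed)
```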

Door-to-door approach

All travel times in our data set are based on trips that follow a door-to-door approach, i.e., they include the overall time required for a trip from an origin to a destination. For instance, for public transport, the times include walking to a stop and waiting in-between connections. For car routing, the travel times include the time needed to park and walk to and from the parking spot, and for cycling the time needed to unlock and lock the bicycle.

Diurnal variations

Driving times are highly sensitive to congestion levels that vary throughout the day. Public transport times are affected in a similar manner, but with reversed effects: during rush hours, shorter intervals provide faster connections (provided that the network is not overloaded). To account for these diurnal variations, we computed travel times separately for the morning rush hour (08:00–09:00), midday (12:00–13:00), and night-time (01:00–02:00), for cars and public transport. Our computation considers the corresponding schedules and the measured car speed coefficients for each of these times.

Routing with different travel modes (door to door)

For all routing, we used the r5py Python package20, which we developed specifically for the purpose of producing travel time matrices for Helsinki and other cities. The tool is modelled after r5r27; similarly to this R library, r5py provides a convenient Python interface to the R5 routing engine21,22,23 that is realised in Java.

With the exception of public transport routes, which are computed using a multi-criteria RAPTOR algorithm23,28 using GTFS schedule data, all routes are calculated using an A* algorithm29,30 on a street network derived from OSM. This is the default behaviour of the R5 routing engine.

Public transport

To find public transport routes, we searched for all connections between all origins and destinations, departing at every minute of a one-hour window, and recorded the first percentile travel time (the fastest connection in this time window). The algorithm followed a door-to-door approach, using walking as the mode for access, egress, and transfer between stops. The distances for access and egress are capped at 2,000 m, and for transfer between stops at 1,000 m. The model was computed for different walking speeds to represent the differing abilities of city residents.
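Why the first percentile over a one-hour departure window amounts to the fastest connection can be seen in a small sketch. It uses the nearest-rank percentile definition, which is an assumption made for illustration; the routing engine may compute percentiles differently. The schedule (one departure every 15 minutes, 22 minutes of riding) and the function name `percentile_nearest_rank` are hypothetical.

```python
import math


def percentile_nearest_rank(values, percentile):
    """Nearest-rank percentile: the value at rank ceil(percentile / 100 * n)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical door-to-door travel times (minutes) for 60 departures, one per minute.
# A vehicle leaves every 15 minutes, so the waiting time shrinks from 14 to 0 minutes
# and then jumps back up again.
ride_minutes = 22
travel_times = [(14 - minute % 15) + ride_minutes for minute in range(60)]

fastest = percentile_nearest_rank(travel_times, 1)
median = percentile_nearest_rank(travel_times, 50)
print(fastest, median)
```

With 60 departure minutes, the first percentile falls on the lowest-ranked value, whereas a median would also include average waiting time.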

Private car

Travel times for cars are sensitive to actually driven speeds, as opposed to speed limits, to account for congestion and road design. We calculated speed coefficients using floating car data for each road segment based on urban structure and prescribed speed limit. We replaced the turn penalties built into the R5 engine with edge traversal times calculated using these speed coefficients. As a result, our model can more realistically represent typical driving speeds on different types of roads (e.g., residential streets, highways), as described in detail in the section on validation and evaluation, below.

As with the other travel modes, we followed the door-to-door approach for private car routing. For car travel, this included walking times to and from the car, and the time needed to search for a free parking spot, which varies greatly in different parts of the metropolitan area. We used the values identified in a survey31 of typical parking times in the Helsinki Metropolitan area, categorised by municipality and urban form.
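The door-to-door composition of a car trip can be sketched as a simple sum of its parts. The parking search times below are hypothetical placeholders keyed by municipality and urban form; they do not reproduce the survey values31, and the function name `door_to_door_car_minutes` is likewise an assumption.

```python
# Hypothetical parking search times in minutes, keyed by (municipality, urban form).
# These are placeholders and not the values of the cited survey.
PARKING_SEARCH_MINUTES = {
    ("Helsinki", "inner_city"): 4.0,
    ("Helsinki", "suburban"): 1.5,
    ("Espoo", "suburban"): 1.0,
}


def door_to_door_car_minutes(walk_to_car, driving, destination, walk_from_car):
    """Sum all parts of a car trip; `destination` is a (municipality, urban form) key."""
    parking_search = PARKING_SEARCH_MINUTES[destination]
    return walk_to_car + driving + parking_search + walk_from_car


total = door_to_door_car_minutes(
    walk_to_car=2.0,
    driving=18.0,
    destination=("Helsinki", "inner_city"),
    walk_from_car=3.0,
)
print(total)
```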

Cycling

For cycling routes, we retained the logic implemented in the R5 engine that prefers cycling infrastructure over car lanes and adds penalties for stop signs and traffic lights. On top of that, we assigned cycling speeds to each street segment according to three different categories of cyclists: (1) to model ‘fast’ cyclists, we extracted average cycling speeds reported on the fitness tracker app Strava, (2) for ‘slow’ cyclists, we computed a network-wide average of cycling speeds reported in the Helsinki bike-sharing system and applied the per-edge corrections obtained from the Strava data set, and (3) for ‘average’ cyclists we used the mean values between the two. The network-average speeds were 19.89 km/h, 16.41 km/h, and 12.93 km/h for fast, average, and slow cyclists, respectively. Finally, we added a flat penalty for unlocking and locking a bicycle (30 seconds each32).
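The relation between the three network-average speeds, and the effect of the flat locking penalty, can be checked with a few lines. This sketch assumes a constant speed over the whole trip and ignores the per-edge speeds and intersection penalties that the routing engine applies; the function name `cycling_minutes` and the 5 km example distance are hypothetical.

```python
FAST_KMH = 19.89  # network-average speed of 'fast' cyclists (from the text)
SLOW_KMH = 12.93  # network-average speed of 'slow' cyclists (from the text)
AVERAGE_KMH = (FAST_KMH + SLOW_KMH) / 2  # mean of the two, i.e. 16.41 km/h

LOCK_UNLOCK_MINUTES = 2 * 30 / 60  # 30 seconds each for unlocking and locking


def cycling_minutes(distance_km, speed_kmh):
    """Riding time at a constant speed plus the flat unlock/lock penalty."""
    return distance_km / speed_kmh * 60 + LOCK_UNLOCK_MINUTES


for label, speed in [("fast", FAST_KMH), ("average", AVERAGE_KMH), ("slow", SLOW_KMH)]:
    print(label, round(cycling_minutes(5.0, speed), 1))
```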

Walking

For computing walking routes, we retained the logic implemented in the R5 engine that adds penalties for road crossings, traffic lights, and busy roads. In the Helsinki region, OSM provides a comprehensive and detailed walking network, which also includes footpaths in parks and in other areas without motorised traffic. As with cycling, we computed travel times for ‘average’ and ‘slow’ walkers, separately, at 5.15 km/h and 3.43 km/h, respectively. These values are based on the walking speed measurements carried out by Willberg et al.19, who found these values to describe typical walking speeds of the adult population in summer conditions and of older people in winter conditions, respectively.

Data Records

The data set is available for download at Zenodo (record 11220980)33.

Available file formats

The data is available in multiple formats that cater to different requirements, such as different software environments. All data formats share a common set of columns (see Table 1) and can be used interchangeably.

  • (A) A comma-separated text file (CSV), without geometries. This data set contains all travel times in one file and can be filtered by origin or destination according to the analysis at hand. The file contains all data columns as described in Table 1, no geometry, and can be joined to the geometries data set (D).

  • (B) A set of 13,132 CSV files, each containing the travel times to one destination grid cell from all other grid cells. The files contain all data columns as described in Table 1, no geometry, and can be joined to the geometries data set (D).

  • (C) An OGC GeoPackage (GPKG) file containing all data columns as described in Table 1 (travel times) and the geometries that relate to the destination grid cell of each trip.

  • (D) GPKG and ESRI Shapefile files of the spatial grid, containing the geometries and IDs of the 13,132 grid cells used as origins and destinations in the analysis. This file can be joined with the data files using either the ‘from_id’ or the ‘to_id’ column.

    Table 1 Data columns contained in the travel time matrix data set.
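As a hedged illustration of how the all-in-one CSV file (A) might be filtered by destination, the sketch below uses only the Python standard library. The columns ‘from_id’ and ‘to_id’ are named in the list above; the travel time column `walk_avg`, the cell IDs, and all values are hypothetical stand-ins, as is the function name `travel_times_to`.

```python
import csv
import io

# Stand-in for the all-in-one CSV file (A); column 'walk_avg' and all values
# are hypothetical and only serve to illustrate the filtering step.
SAMPLE_CSV = """from_id,to_id,walk_avg
1001,1002,12
1001,1003,35
1002,1003,20
1003,1003,0
"""


def travel_times_to(destination_id, csv_file, column):
    """Return {from_id: travel time} for all rows that end at `destination_id`."""
    result = {}
    for row in csv.DictReader(csv_file):
        if row["to_id"] == str(destination_id):
            result[row["from_id"]] = float(row[column])
    return result


to_1003 = travel_times_to(1003, io.StringIO(SAMPLE_CSV), "walk_avg")
print(to_1003)
```

For the real file, an open file handle would take the place of the `io.StringIO` object, and the resulting IDs could then be matched against the grid cell IDs of the geometries data set (D).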

Technical Validation

We carried out a systematic validation of travel times for all travel modes. For this, we randomly sampled 100 points over the study area, thus selecting 9,900 trips between them, for each mode. We then queried travel times for the same origin-destination pairs and travel modes from the Google Directions API (https://developers.google.com/maps/documentation/directions/overview), at the same departure times (rush hour, midday, night-time). While we do not consider the travel times reported by this API as ground truth, its routing engine is technologically independent from r5py and bases its results on trip data recorded by the numerous users of the service.
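The number of validation trips follows from counting ordered pairs of distinct points: 100 × 99 = 9,900. A minimal sketch of such a sampling step is shown below; it assumes that the sample is drawn from grid cell identifiers, uses a placeholder ID range, and fixes the random seed, none of which is stated in the text.

```python
import itertools
import random

random.seed(0)  # fixed seed so that the sample is repeatable

# Hypothetical stand-in for the identifiers of the 13,132 grid cells
grid_cell_ids = list(range(13_132))

# Draw 100 distinct points without replacement
sample_points = random.sample(grid_cell_ids, 100)

# Ordered origin-destination pairs between distinct points
od_pairs = list(itertools.permutations(sample_points, 2))
print(len(od_pairs))
```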

Figure 1 and Table 2 report the results of these comparisons, which overall have high congruence. However, some observable differences between our dataset and Google Directions arise, which can be explained by different methodological nuances:

Fig. 1

Comparison between the travel times computed by our approach and the travel times retrieved from the Google Directions API for different modes of transport with different parameters. Panels a-d show a comparison of the distribution of absolute values of travel times, panels e-f the distribution of pairwise differences (from same origin to same destination).

Table 2 Differences (in minutes) between the mean travel times reported by our travel time matrix and the Google Directions API.

The differences in cycling time (13 minutes on average, assuming that Google Directions routes for an ‘average’ cycling speed) cannot be explained by our door-to-door approach alone, which adds time for unlocking and locking a bicycle. Rather, our model’s cycling speed seems to deviate from the undisclosed assumptions of the Google Directions routing engine, which also does not seem to prioritise cycling infrastructure to the same degree as our approach. We consider our approach closer to the everyday realities of urban utility cycling.

The night-time public transport travel times reported in our data set deviate considerably from the ones computed by the Google Directions API: they are on average 66 minutes longer. This can be explained by differences in the routing logic. r5py selects only connections departing within the specified hour, even when they include a long transfer time during the trip. Google Directions, in contrast, tends to postpone the departure time until a more optimal connection is found, which results in shorter journey times at night-time, when public transport connections are scarce. The latter logic underreports waiting time at the beginning of a trip (departure is shifted outside of the prescribed departure window) and is undesirable when aiming for reproducible and repeatable results that can be compared across times of the day, across modes of transport, and across different places.

The estimated car travel times appear slightly longer in our data set at all times of the day (5–7 minutes, on average), even after we added the parking penalties to the Google Directions data for comparability. As for all other modes, our snapping approach adds the time needed to reach the closest road segment from any origin or destination point, walking at an average speed. Google Directions, in contrast, does not account for these extra times. Rather, its algorithms find the closest network edge that can be routed for car traffic, while our approach snaps to the walkable network, adding access time to drivable street segments. Overall, our evaluations showed that the travel times we computed represent realistic values.

While some of the modes at certain times of the day show differences from the travel times reported by the Google Directions API, we are confident that our data is complete and correct and can be used in analyses and planning.

Usage Notes

The data supports many potential use cases. The data set can be used as input data for complex computations, or for quick exploratory visualisations, such as the examples below. The pre-computed nature of our data set allows fast and low-threshold exploration of the complex patterns of urban transport. Below, we describe an exemplary set of use cases:

Comparing accessibility landscapes by travel mode

Comparing accessibility landscapes between neighbourhoods and travel modes enables analysing functional structures of the region and the integration between land use and transport systems (Fig. 2). Is location A more accessible by private car, public transport, cycling or walking? Where are the most accessible grid squares for different travel modes located? The underlying 250 × 250-meter YKR grid further makes it possible to link travel times with socio-demographic data and to calculate cumulative population counts for every grid square in addition to travel times.

Fig. 2

Comparing travel times by mode of transport: the panels show the travel time to reach the main railway station in Helsinki city centre by (a) walking, (b) public transport, (c) cycling, and (d) car. The travel times are in minutes. Data: the authors (travel time matrix), Helsinki Region Infoshare (borders, sea).
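A comparison of this kind can be prototyped in a few lines once the travel times to one destination have been extracted. In the sketch below, the cell names, mode labels, and travel times are hypothetical and do not reproduce the column names of Table 1; the function name `fastest_mode` is also an assumption.

```python
# Hypothetical travel times (minutes) to one destination cell, per origin and mode
travel_times = {
    "cell_a": {"walk": 25, "cycle": 9, "public_transport": 14, "car": 12},
    "cell_b": {"walk": 140, "cycle": 48, "public_transport": 41, "car": 27},
    "cell_c": {"walk": 70, "cycle": 24, "public_transport": 22, "car": 23},
}


def fastest_mode(times_by_mode):
    """Return the (mode, minutes) pair with the shortest travel time."""
    return min(times_by_mode.items(), key=lambda item: item[1])


fastest = {origin: fastest_mode(times) for origin, times in travel_times.items()}
print(fastest)
```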

Assessing the impact of changes to transport networks

Another typical use case is the mapping of the accessibility impact of major transport projects by comparing changes in travel times between years. For example, in the Helsinki Metropolitan Area, an extension to the existing metro line was opened in December 2022, which led to changes in public transport travel times. By comparing travel times before and after the extension in a simple map visualisation of the data set presented here, we can capture the magnitude and spatio-temporal variation of these changes (Fig. 3).

Fig. 3

Change in public transport travel times to the central railway station across the Helsinki metropolitan area between 2018 and 2023. Blue shades indicate shorter travel times (faster connections), red shades an increase in travel time (slower connections). The metro line extension, opened in December 2022, serves the South-Western areas of the region. Data: the authors (travel times), Helsinki Region Infoshare (topographic data).
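The underlying comparison is a per-cell difference between two versions of the travel time matrix. The sketch below shows this under the assumption that both versions share grid cell identifiers; the cell names, travel times, and the function name `travel_time_change` are hypothetical.

```python
# Hypothetical public transport travel times (minutes) to one destination,
# keyed by origin grid cell, for two versions of the travel time matrix
times_2018 = {"cell_a": 52, "cell_b": 31, "cell_c": 18}
times_2023 = {"cell_a": 38, "cell_b": 33, "cell_c": 18, "cell_d": 45}


def travel_time_change(before, after):
    """Difference (after - before) for cells present in both years; negative means faster."""
    common = before.keys() & after.keys()
    return {cell: after[cell] - before[cell] for cell in sorted(common)}


change = travel_time_change(times_2018, times_2023)
print(change)
```

Cells that appear in only one of the two versions are left out here; whether to drop or flag them is a design choice that depends on the analysis at hand.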

The 15-minute city in the light of different personal capabilities and different conditions of the physical environment

Because the data account for temporal dynamics, for the varying abilities of people, and for street conditions, they make it possible to compare how different temporal and traveller assumptions impact accessibility landscapes. For example, we can compare how various assumptions affect the 15-minute service availability by walking, a heavily debated topic in urban planning (Fig. 4). Varying the walking speeds from values representative of average adults in dry street conditions (average walking speed) to values representative of older adults in winter conditions (slow walking) makes it possible to estimate how equal service access is for different population groups.

Fig. 4

Accessibility modelling for varying personal capabilities and external restrictions (modified from Willberg et al.19). The panels show the differences in model outcomes if assumptions are changed only slightly: Model 1, the baseline model, shows the cumulative share of population that can reach a grocery store within 10, 15, or 20 minutes walking, for an average walking speed, in dry road conditions, and at 5 pm. For models 2 through 4, one parameter each is changed to reflect the walking speed of an older person, the restrictions imposed by slippery road conditions, and the reduced service network density at 6 am. Models 5-7, then, evaluate combinations of two of these factors, and model 8 the impact that all three factors combined have on the model outcome. Only 34% of older people can reach a grocery store within 15 minutes walking in winter conditions in the early morning (model 8), as opposed to 93% in the baseline model (model 1).
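The cumulative shares reported in such models can be derived from travel times and population counts per grid cell. The following is a minimal sketch with hypothetical inputs, not the analysis of Willberg et al.19: the walking times to the nearest store, the population counts, and the function name `share_within` are assumptions made for illustration.

```python
# Hypothetical inputs: walking minutes from each grid cell to its nearest grocery
# store, and the number of residents per grid cell
minutes_to_nearest_store = {"cell_a": 6, "cell_b": 14, "cell_c": 19, "cell_d": 32}
population = {"cell_a": 400, "cell_b": 300, "cell_c": 200, "cell_d": 100}


def share_within(threshold_minutes, minutes_by_cell, population_by_cell):
    """Share of the population living in cells within `threshold_minutes` of a store."""
    total = sum(population_by_cell.values())
    reached = sum(
        population_by_cell[cell]
        for cell, minutes in minutes_by_cell.items()
        if minutes <= threshold_minutes
    )
    return reached / total


shares = {t: share_within(t, minutes_to_nearest_store, population) for t in (10, 15, 20)}
print(shares)
```

Repeating the calculation with the travel times of a different walking speed or time of day gives the kind of model comparison shown in the figure.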