Lessons from nine urban areas using data to drive local sustainable development

Cities are central to achieving the 2030 Agenda for Sustainable Development, yet many remain disengaged from the process. This paper examines nine city and subnational pilot projects conducted between 2015 and 2019 to assess the role of a data-based approach to governance in improving decision-making for sustainable development. We consider to what extent a data-based approach to governance can help cities in (a) localizing the sustainable development goals (SDGs); (b) integrating national and local datasets; (c) increasing openness of data and enabling better accountability for outcomes; and (d) encouraging innovation in data gathering. We find that data is considered a useful entry point for local sustainable development conversations and that a focus on data management is welcomed by nearly all local governments; however, few localities explicitly focus on data to improve their sustainable development outcomes. Critically, despite the excitement around big data, most cities are unable to access it due to capacity challenges, data ownership, and privacy concerns.


INTRODUCTION
With more than 60% of people expected to live in cities by 2030 1 , the importance of cities for achieving the well-being of people and planet was well recognized in the 2030 Agenda for Sustainable Development, particularly through the inclusion of a stand-alone goal on cities (SDG 11). It has been widely acknowledged that taking a place-based approach, with a particular focus on urban areas, will be crucial for the success of all 17 sustainable development goals (SDGs), not just SDG 11 [2][3][4][5][6][7][8] .
To make the global goals and targets relevant at the local level, a process of SDG 'localization' has emerged 2,3,6-15 , acknowledging that 'local-level indicators must be included in any future development framework, because local governments are the primary point of institutional contact for the majority of individuals' 16 . Over time, localization processes have taken on two purposes: to garner local governments' support in achieving the SDGs through action on the ground and to make the SDGs a useful framework for local sustainable development policy 17 .
Yet, more than five years on from the international agreement on the 2030 Agenda, many cities remain unaware of the SDGs or resistant to adopting what they consider to be another onerous reporting system 18 . In some contexts, where there is no national impetus to incentivize SDG adoption, or there are inadequate resources to fulfil everyday local-level priorities, the agenda faces institutional and political resistance 19,20 . If the Agenda is perceived by local agencies as just a compliance exercise, sustainable development is unlikely to be achieved at scale by 2030. There is an urgent need to reimagine the approach to SDG localization to garner more political support and encourage adoption.
While there may be other useful approaches that can lead to successful localization 13,21 , one good place to start is by recognizing local governments' needs and current capabilities for delivering local-level development policies. Policy discussions at the global and local level by institutions such as the United Nations and United Cities and Local Governments have identified a functioning local data system as a crucial ingredient for effective local decision-making [22][23][24][25] . Within the SDG agreement itself, the 2030 Agenda, data is also recognized as a crucial means of implementation 23 . As recognized in paragraph 48 of the agreement, 'Quality, accessible, timely and reliable disaggregated data will be needed to help with the measurement of progress and to ensure that no one is left behind. Such data is key to decision-making.' 23 The assumption of the Heads of State and Government who endorsed the agreement was that more timely and relevant information on key sustainable development outcomes would help guide policy decisions and ensure interventions and investments were carefully targeted to reach the most disadvantaged. Whilst the efficacy of data and evidence-based policy-making is a broad area of academic discussion, the scope of this paper prohibits a summary of this debate. Suffice it to say that we concur with the proponents of evidence-informed policy, such as Cairney and Oliver (2017), in assuming that more timely and accurate information at the city scale (here referred to as data) will provide policy-makers with the tools to make more impactful decisions 26 .
Many local governments around the world, such as Amsterdam, Visakhapatnam, Phoenix, or Mexico City, have also demonstrated their commitment to data-informed decision-making by adopting e-governance methods or by participating in 'smart city' programs. Such programs aim to improve technological capacities and increase data availability to support effective decision-making, but more importantly, they showcase governments' willingness to innovate and undergo institutional changes 27 .
In this paper we therefore hypothesize that a data-based approach to the governance of local sustainable development, which aims to improve the quality and utility of local data on sustainable development outcomes and harness the opportunities afforded by the data revolution, offers a promise of more targeted, impactful action towards local sustainable development outcomes. To test this hypothesis, and to reflect on the ability of subnational and urban areas to use data in support of SDG localization, we examine the results of nine action-research pilot studies supported by the Sustainable Development Solutions Network (SDSN). Conducted between 2015 and 2019 by the SDSN Thematic Research Network on Data and Statistics (TReNDS) and Cities Thematic Network, these pilot projects took place in a variety of geographies and developmental contexts. They include three US cities from the United States' Sustainable Cities Initiative (US-SCI) (Baltimore, New York, and San Jose) and six urban and subnational regions from the Local Data Action Solutions Initiative (LDA-SI) (City of Patiala in India; Metropolitan regions of Belo Horizonte, Rio de Janeiro, and São Paulo in Brazil; Los Angeles in the U.S.; a Colombian City Network representing 16 cities; Bristol in the U.K.; and Aruba representing a Small Island Developing State). The US-SCI aimed to support a sample of US cities and their key stakeholders to familiarize themselves with, customize, and adopt the SDGs sub-nationally, as well as to identify tools that might enable them to progress towards the goals. The latter, LDA-SI, was more squarely focused on data and the SDGs, investigating different ways local governments were using data and information to help advance their sustainable development objectives. More details on these projects and the pilot assessments are presented in the methods section at the end.
In 2013, at the outset of deliberations on the SDGs, the High-Level Panel of Eminent Persons on the Post-2015 Development Agenda called for a 'data revolution' to facilitate the move towards sustainable development 28 . Data revolution describes the emerging transformations stemming from the increase in the volume, speed, and types of data that are now available. In the following year, the UN Secretary General's Independent Expert Advisory Group on Data Revolution for Sustainable Development (IEAG) issued recommendations on how the data revolution could be mobilized for sustainable development 24 . The IEAG acknowledged how the community of data producers, the means of data collection and dissemination, and the topics covered by data were all expanding, alongside the growing demand for better data from different parts of society.
However, this data revolution has so far been imagined at the national and international levels. While the Millennium Development Goals (MDGs) were lauded for strengthening the national statistical capacity of member countries, and for improving statistical methodologies and information systems internationally [29][30][31][32][33] , local and sub-national data monitoring capacities fell outside their scope. Even though the IEAG called for strengthening statistical systems at the 'local, national, and international level during preparations for the SDGs' 24 , their recommendations did not include specific measures that could help achieve this vision for local governments. The methods for assessing the financial implications and funding options related to the data revolution are still focused at the national and international levels 32,34,35 . Others, however, including many UN Agencies like the United Nations Development Program and UN-Habitat, urban think tanks, and academics, as well as some local governments, are increasingly emphasizing the data revolution's importance at the subnational and urban levels 7,17,[36][37][38][39][40] .
Meanwhile, local governments continue to face several issues in taking advantage of this data revolution. Across both developing and developed countries, local governments lack the requisite financial resources to generate locally relevant data, build statistical systems, and foster capacity and skills 41,42 . Without necessary investments in local data systems, local governments will struggle to monitor progress on urban SDG targets and broader local sustainable development objectives.
A critical issue found with data is its relevance for policy-making. The data available for development policy are often input-oriented rather than outcome-oriented, meaning they quantify basic interventions (like the number of toilets built) instead of the resulting human impacts (like reductions in open defecation) 43 . Data is often not disaggregated to account for vulnerable population groups, making it impossible to calculate important indicators, like maternal mortality amongst women of color 44 . Relatedly, a lack of spatial data makes it challenging to examine inequalities between neighbourhoods 45 . There are other challenges with aggregating data due to varying production methods and underlying definitions, such as different meanings of 'urban' areas 42 . Even when data are available, they are often dated or cover a single time period, which makes them of little use for establishing current conditions or trends. Designing policy with data that are not relevant can risk spending limited resources on a project that will not lead to the desired impact 46 . For instance, a city might have readily available data on parking spaces, but if the overall objective is to encourage citizens to use more sustainable modes of transportation, measuring the ride-share component of daily commutes is likely to be more impactful. The intended outcomes need to drive the demand for data to help address problems with appropriate information.
Another issue covered in the literature is the interoperability 39 of data across different sources, formats, and methodologies. Although relevant data may exist, it may be inaccessible to policymakers at different levels or across departments, or difficult to integrate into official statistics. Reasons for this include data ownership barriers between public and private sectors 36 ; limitations of functional jurisdiction between government departments; or simply the form in which data were recorded 22 . Data reporting remains largely non-standardized 12 , which limits policymakers' ability to aggregate and recognize patterns at the local, national, or international levels. At the same time, standardizing sets of indicators, data, and/or methods across the diverse and complex urban realities 2 can limit their utility for localized development planning 42 .
Despite these issues, the data revolution could offer many opportunities for cities if the issues are addressed head on. Apart from traditional data sources (e.g. surveys), innovative methods of generating data (e.g. participatory enumeration processes) at the neighborhood and urban levels are offering insights into previously 'intractable' problems 47,48 . Technologies, such as mobile phones and satellite data 49,50 , are offering volumes and varieties of data to help inform policy, transform society, and protect the environment. Meanwhile, efforts to harmonize data across communities can enable lesson sharing, collaborative action, and the pursuit of common national and subnational goals. Ultimately, data can improve outcomes for all people and the entire planet so that no one 51 , place, or ecosystem gets left behind from the sustainable development agenda. Because local governments are at the forefront of achieving this agenda, they need to be part of the ongoing data revolution. While existing literature demonstrates that SDG localization is critical for both local and global sustainable development (see Table 1), the data revolution is seldom addressed other than at the national and global scales. With this paper, we argue that the active localization of the data revolution can play a significant role at the sub-national level in achieving local sustainable development goals, as well as contributing to the global ambitions of sustainable development (see Fig. 1).
In our analysis, we ask: To what extent can data-based approaches improve evidence-based policy at the local level and increase attention towards local and global sustainable development? We aim to identify aspects of the data revolution that are proving useful, as well as data-based innovations that cities are undertaking to overcome challenges they may encounter to achieve their sustainable development objectives. The following sub-questions have guided our analysis:
1. Does more and better data improve the focus on local sustainable development outcomes, especially those in line with the global sustainable development objectives? We study the utilization of data through the local governments' abilities to (a) foster local, evidence-based policy-making for sustainable development, and (b) integrate national data at the local level.
2. Does increasing accessibility of data inform and empower people to engage more with the local government, in turn affecting their development outcomes? We study the changes in data accessibility and openness for empowering people through the local governments' use of (c) data platforms and communication.
3. Are local governments able to harness latest technologies offered by the data revolution to improve the availability of relevant data and information? We explore the improvements in data availability through (d) innovation in data gathering.
4. Are local governments able to afford these processes and make them sustainable in the long-term? We explore the viability of the process through (e) cost of implementation, and (f) sustainability of the processes adopted.
The key objectives of these pilots were to understand: (A) what indicators to select and why, (B) how to link local and national reporting systems, (C) how to build data visualization platforms, and (D) how to utilize third-party data. Findings on each of these are aligned with the research sub-questions (a), (b), (c) and (d) respectively. Research questions (e) and (f) were added to this subsequent analysis to understand how feasible and sustainable these efforts are. Where processes were still ongoing at the time of this research (2020), 1-4 years after the pilots were conducted, we consider this a strong indication of their long-term viability. Table 1 and Fig. 1 lay out the various strands of this discussion and show where there is existing literature, as well as where and how this paper intends to contribute.

RESULTS
Using a data-based approach facilitates localization of the SDGs
To initiate the localization exercise, all participating cities (or key partner institutions within those cities) first mapped the global SDGs with their existing city priorities, plans, and functional/jurisdictional limits. The intention was to foster more political buy-in and ownership of the SDG localization process. For example, Patiala mapped the SDGs according to the 18 municipal functions mandated by the Constitution and existing city priorities so that officials could see a practical alignment of the global goals with their duties and responsibilities. In doing so, they were able to recognize the importance of monitoring many SDG targets at the city-level (e.g. air pollution, drinking water access), but they also found that some key city priorities were missing within the SDG framework (e.g. parking management, stray animal welfare). This approach helped them develop a strategic process, starting with identifying what information the cities already had; systematically examining and coordinating existing plans and data resources; and capitalizing on political priorities, financing mechanisms, human talent, and timelines to set achievable targets.
[Figure caption, opening lost in extraction: ... Table 1), this paper focuses on links 4 and 5, demonstrating the two-pronged roles of the data revolution: to enable the SDGs to provide a framework for local sustainable development policy, and to drive local governments' targeted focus for local sustainable development outcomes. Note: We use the term 'local development priorities/policies' to refer to the existing local development policies (which may or may not be comprehensive or aimed towards sustainable development). It is when they are either explicitly developed keeping sustainable development as an objective or are aligned with the SDG framework, that they are referred to as local sustainable development policies.]
The SDG framework also offered cities the opportunity to more holistically examine their planning approaches and see what might be missing. New York City (NYC), for instance, added equity as a key dimension to their environment, economy, and infrastructure priorities in their OneNYC plan. In the context of Small Island Developing States, Aruba used the SDG framework to make a stronger argument for connections to be made between ecosystem services, the tourism economy, and overall quality of life and well-being. In this way, the Agenda's integrated social, economic, and environmental objectives encouraged cities to make progress on multiple dimensions without compromising other objectives.
The cities also used the SDG framework in several other creative ways to support local priorities. Baltimore and Patiala found the framework useful as it provided a common language to discuss sustainability across different sectors and with different stakeholders. Colombia used the framework as a communications and fundraising tool in order to gain broader support from across sectors, such as finance, and attract more financial resources for investing in sustainable development.
Instead of forecasting various outcomes based on past trends, the pilot cities also started employing a method of 'backcasting' 52 , which began with projecting desired outcomes and then working backward to understand what would be needed to realize these goals. In cases where local-level plans and visions already existed, they often lacked measurable goals. Additionally, even when some of the pre-existing plans were associated with quantitatively measurable targets, very little progress had been made. In the case of San Jose, a 50% reduction in per capita energy use by 2020 was established in their Green Vision 2007, but had not resulted in much change. An examination of whether the established targets were realistic was therefore convened, which involved the participation and subsequent buy-in of key stakeholders. In nearly all cases, the process of aligning the local-level priorities with the framework helped city officials, citizens, and other stakeholders establish a common long-term vision, their desired impact, starting points, and assess their technical capacity for the required data and monitoring that would accompany the goals.
In addition to using indicators that helped diagnose issues or monitor results, some pilots included indicators that could help evaluate the processes and systems that would be critical for achieving the goals. For instance, the Brazilian cities São Paulo, Belo Horizonte, and Rio de Janeiro developed indicators that could evaluate local processes, including data system development, existence and effectiveness of local public plans, policies, and participatory councils for housing, risk management, accessibility, public spaces, and green areas. These eventually led to the establishment of local data observatories in these three cities. This showcases the commitment from the cities to invest in long-term viable processes to monitor and achieve sustainable development, beyond the lifetime of the pilot. The similar emergence of urban observatories 53,54 in many other cities, including Newcastle 55,56 and Phoenix 57 , strengthens the finding that cities are indeed making innovations and committing to substantial institutional changes in order to enable better and more effective governance approaches.
Finally, all of the cities used the SDG dialogue to identify underlying drivers of sustainable development challenges. The process highlighted the limitations of the SDG framework and the need for supplementary sustainable development measures, as most cities had to add targets or indicators specific to their context. For example, Baltimore ascertained that liquid asset poverty was a strong indication that a household may not be resilient to shocks and stresses, such as layoffs, family illness, or property damage, and therefore included it as a key indicator for Goal 1 on poverty reduction. Los Angeles recognized the criticality of disaggregated data as a means to address multiple issues: Goal 3, maternal mortality, was monitored across ethnic groups because their experience and preliminary data-based studies suggested that despite an overall low rate of maternal mortality, African American mothers, regardless of socio-economic status or education levels, were three to four times more likely to die during childbirth. Goal 4 was also changed to 'education for all children' because not all children identified within the gender binary, and this disaggregation helped set more aggressive targets than those directed by the official framework. While primarily focused on SDG 11, the Metropolitan SDG Observatory (METRODS) at Belo Horizonte identified a set of 55 indicators that would address the diversity of conditions within one metropolitan region. For Patiala, although parking management was a core municipal issue, the selected indicator was focused on promoting public transport, cycling, and walking. NYC also used a more city-specific measure of poverty, considering the high cost of living and affordability. These findings are summarized in Fig. 2.
Integrating local with national data is challenging but achievable
Using data for improving local-level decision-making has gained traction, but its integration with national-level monitoring poses additional demands on cities that may already be overburdened. These challenges can be understood through the two broad steps of national-local integration: (i) reporting from subnational to the national level and (ii) aggregation and comparison across multiple subnational areas.
To achieve these two processes cities would be required to collect annual standardized data between 2015 and 2030 and have this data available at a census tract or neighborhood-level. As many cities and municipalities often have limited to no individual statistical capacities, this reporting can seem onerous. Furthermore, with limited financial resources to expand these data collection capacities at the local-level, fulfilling this requirement can be difficult unless alternative sources of data can be identified.
Enabling aggregation and comparison across multiple subnational or local areas requires all subnational entities to use indicators and methodologies that are consistent, either with the official SDG indicator framework or at least across the national level. However, the choice of targets and indicators across cities, like Los Angeles and San Jose, varied from (i) those that matched the global set; (ii) those that were in line with the global indicators, but with slight variations; (iii) different indicators that were still relevant proxies for some SDGs; to (iv) different targets and indicators that were relevant at the local level but had no provision in the SDG framework. While types (i) and (ii) could be compared across regions, types (iii) and (iv) are primarily useful for local decision-making. Nevertheless, a potential middle ground, balancing the two seemingly conflicting functions of local-level decision-making and aggregate-level reporting, emerged from both the Colombian and Brazilian cities.
Nationally, Colombia is considered a leader in SDG implementation, but at the subnational level uptake has been limited. The pilot, involving 16 member cities from the Cómo Vamos Cities Network (CVCN) in partnership with the Fundación Corona and Columbia University, developed data tools that could be adapted to the varying contexts, including a common list of city-level SDG targets, data indicators, and an open data platform. A common set of indicators was developed, ranging from nationally-defined indicators to local-level indicators. After an initial desk-based evaluation of the indicators, an expert evaluation and stakeholder consultation were used to help identify a set of priority indicators. A 'Benchmark Target Framework' (setting ranges instead of a fixed value to achieve targets set for each indicator) was used to help the cities set their target thresholds to ensure that they were locally relevant and ambitious while supporting the government's national targets. The framework was also used to drive some intercity competition, to stimulate change and demonstrate how cities in a single country could use common data frameworks to benefit both their constituencies and support national government initiatives.
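The idea of setting a range rather than a single fixed value per indicator can be illustrated with a minimal sketch. It is not the CVCN implementation: the `BenchmarkRange` structure, the `validate_targets` helper, the indicator names, and all numbers are hypothetical and chosen only to show how a city-chosen target could be checked against a nationally agreed band.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRange:
    """Acceptable band for a city's target on one indicator (hypothetical structure)."""
    indicator: str
    minimum: float  # least ambitious target still consistent with the national target
    maximum: float  # most ambitious target considered realistic

    def accepts(self, city_target: float) -> bool:
        # A city target is valid when it falls inside the band, bounds included.
        return self.minimum <= city_target <= self.maximum


def validate_targets(ranges, city_targets):
    """Return the indicators whose city target is missing or outside its band."""
    problems = []
    for r in ranges:
        target = city_targets.get(r.indicator)
        if target is None or not r.accepts(target):
            problems.append(r.indicator)
    return problems


# Illustrative values only; they are not taken from the pilot.
ranges = [
    BenchmarkRange("households_with_piped_water_pct", 90.0, 100.0),
    BenchmarkRange("public_transport_mode_share_pct", 30.0, 50.0),
]
city_targets = {
    "households_with_piped_water_pct": 95.0,
    "public_transport_mode_share_pct": 25.0,
}
out_of_band = validate_targets(ranges, city_targets)
print(out_of_band)
```

Under these assumed values, the second target falls below its band and would be flagged for revision, while the first is accepted as both locally chosen and nationally consistent.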
Unlike Colombia, no official national SDG framework was established in Brazil due to an economic and political crisis in 2016. Moreover, there was no engagement between civil society organizations (CSOs) and the national government, and the data presented in the government's Voluntary National Review (VNR) at the High-Level Political Forum (HLPF) in New York was considered not representative of CSO voices and of the country's ongoing economic crisis 58,59 . It was in this context that a number of Brazilian municipal authorities recognized the need to focus on improved data collection. This political situation led the Metropolitan agencies and civil society groups to come together and develop a Metropolitan Region-level SDG framework that established a minimum number of common indicators for all metropolitan areas to use. An index was developed that intended to establish a high standard for shared measurement and cross-comparison between locations.
A common challenge faced by most of the pilot locations relates to institutional capacity. In 7 of the 9 locations, there was no single legitimate official institution capable of compiling and organizing the data across scale and institutions. To deal with this, the Aruban government passed a Ministerial Decree institutionalizing the National SDG Commission and the SDG Aruba Indicator Working Group (AIWG) that released a baseline of 230+ Global Monitoring Indicators in 2018. Simultaneously, it initiated the process of localizing the SDGs by defining a set of context-specific targets and indicators through its SDG Roadmap. This helped anchor the SDGs in a national framework, but also created responsibilities for local governments to help deliver on nationally-defined reporting requirements.
Many lessons emerged from these cases that may be useful for overcoming the challenges of developing integrated national and local datasets. When subnational stakeholders can see a political value in national and international reporting, they are more willing to implement it within their jurisdiction, for example to ensure coordinated national action on sustainable development in the absence of any national policy, as in Brazil. Defining a wide national-level indicator set, through a collective process with an inclusive representation of local authorities and other stakeholders, with a minimum number of commonly accepted indicators for reporting, could also help reconcile local needs with the needs of aggregation and comparability. These findings are summarized in Fig. 2.

Mixed-medium data communication is recommended
As part of the pilots, cities were required to focus on improving their data communication, to engage both policy-makers and the general public on issues of sustainability. The cities took varied approaches to do this, contingent on available technical capacities, legal frameworks, and data availability. Different types of data (e.g. qualitative, quantitative, spatial) and stakeholders (experts as well as those with low data literacy across policymakers, community members, researchers, and practitioners) all demanded a combination of different modes of communication. Three predominant modes used by the pilot projects were the publication of a physical book (e.g. NYC) or annual report, digital platforms or dashboards such as websites (e.g. Baltimore, NYC, San Jose), and public consultations (e.g. San Jose, NYC, Brazil). Each was designed to be accessible to non-experts, largely with the use of data visualization techniques, without losing the rigor, detail, and complexity of the information conveyed. This was done to enable users to navigate, understand, engage with, and monitor the process, and provide feedback on the implementation.
Printed versions were found to be useful in documenting progress made at a particular time and were easily adopted by a variety of stakeholders as a medium they were conversant with. But this medium was costly to scale and did not allow flexibility in analyzing the data as per users' needs. On the other hand, digital platforms offered a specific advantage by enabling aggregation and comparison of SDG progress made by multiple cities, to maximize integration and scalability. For example, the Colombian pilot used a dashboard that enabled the users to compare the progress of a particular city with other cities in a year, observe historic progress made by a city over the years for specific targets and indicators, and learn about the progress made at a sub-city level using a heat map.
Consultations with the public and officials offered an additional opportunity for user testing, evaluating the marketplace of decision-making tools (such as ArcGIS and Esri Story Maps), as well as receiving public feedback on priority actions. While all modes had their limitations, they also offered advantages, justifying the use of a mixed medium approach for maximizing data transparency and communication, contingent on resources available.
The form in which data was shared was another crucial communications consideration. While sharing raw data was seen as most transparent and would enable data-literate actors in the community to use data for custom purposes, analyzed information provided quick insights on SDG progress to the otherwise uninitiated. It was also perceived as imperative to make the raw data available in an open format, with no legal encumbrances on use and reuse, and in widely accepted, non-proprietary (which also reduced costs of acquiring proprietary software), platform-independent, and machine-readable formats. These were the ideal standards set out by most cities, although only four were able to achieve this due to data availability constraints (discussed in section (d)).
In a bid to engage the general public on sustainable development and to show transparency of local government actions, many of the pilots involved efforts to open up subnational datasets. However, this demanded varying levels of discussion on data ethics. To ensure privacy, all data needed to be aggregated in a way that could not be traced back to a unit or household-level, while also providing sufficient disaggregation and granularity that would be useful for policymakers. There was also a distinction made in Baltimore between protected data and sensitive information 60 . Protected data involved an agency that would need to authorize access according to local laws. Information that could raise privacy and security concerns, or jeopardize public health, safety, or welfare to an extent that would exceed the potential benefits of publishing the data, was identified as sensitive information, and not made publicly available.
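The tension between privacy and granularity described above is often handled with small-cell suppression, a common disclosure-control technique. The sketch below is an illustration only: the source does not state that any pilot used this method, and the `aggregate_with_suppression` helper, the field names, and the minimum cell size of 10 are all assumptions.

```python
from collections import Counter


def aggregate_with_suppression(records, key, min_count=10):
    """Count records per area and withhold counts below a minimum cell size.

    Counts smaller than `min_count` are replaced with None so that a published
    figure cannot easily be traced back to an individual unit or household.
    The threshold is an assumed policy choice, not a value from the pilots.
    """
    counts = Counter(record[key] for record in records)
    return {
        area: (n if n >= min_count else None)
        for area, n in counts.items()
    }


# Hypothetical household-level records, reduced to the field being aggregated.
records = [{"neighborhood": "A"}] * 12 + [{"neighborhood": "B"}] * 3
published = aggregate_with_suppression(records, "neighborhood", min_count=10)
print(published)
```

A lower threshold keeps more neighborhood-level detail for policymakers but raises re-identification risk; a higher one does the reverse, which is the trade-off the cities had to negotiate.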
The benefits of open data for designing more effective integrated policy on sustainable development were highlighted in a number of instances, such as in New York with the redevelopment of the Hunts Point area 61 . However, the assumption that opening up this data would motivate citizen action (and this in turn would change government behavior) on sustainable development was not explicitly demonstrated within the study period. These findings are summarized in Fig. 2.
Open data laws can reduce data availability challenges
Almost all the pilots, irrespective of their income levels, faced challenges with gathering the relevant data needed for some or most indicators, especially within the limited period of study, which was 6 to 12 months. The most common data source used by subnational agencies was the national census, but it posed limitations on the data's granularity, disaggregation, and sub-city or neighborhood-level spatial attribution. To reduce their dependence on national data, which was considered unrepresentative and out of date, the pilots in Brazil relied heavily on locally generated data as an alternative source. This included data collected by civil society organizations and data made available by local departments. However, municipal records were often found to be scattered across different departments, and the sharing of data between functional jurisdictions was limited. Some pilots also explored the use of crowd-sourced or citizen-generated data, which enabled the public to provide data directly to the city, although the sample sizes and technical rigor of the data remained too questionable for use in regular reporting.
A number of the cities explored the use of technologies and big data sources to improve their data collection, including using remote-sensing data and web scraping programs. For example, Baltimore noted that while there were critical issues related to low-income households suffering evictions due to a poor justice system, not enough data was available to effectively monitor their justice indicator. To address this, they relied on a study linking housing instability and the justice system, and identified rent control statistics as an alternate indicator to monitor. They employed an automated web scraping program called the Client Legal Utility Engine (CLUE), developed by Maryland Volunteer Lawyers Service (MVLS) and the Baltimore Legal Hackers Group, to pull daily data from the Maryland Judiciary system. Spatially mapping the cases indicated that the greatest number of evictions were occurring in predominantly low-income and minority households. San Jose also explored innovative data collection methods via hackathons and by engaging tech-sector employees to help gather more civic data. In addition, they explored partnerships with technology companies in the neighboring Silicon Valley area to develop a neighborhood dashboard using big data. Meanwhile, NYC used sensors in retrofitted telephone booths to collect real-time environmental and traffic data. In this regard, spatial data sources, such as OpenStreetMap, were found to be useful and acceptable. However, despite the general enthusiasm at the beginning of the process to incorporate big data and non-traditional data sources, all cities faced difficulties with gathering data from private-sector agencies and other non-public sources. The Colombian Cities network categorically identified the lack of a local-level platform to catalyze synergies between private, public, and civil society as an issue.
In particular, the process of establishing legal data-sharing agreements (DSAs) with these entities was especially burdensome. Within the DSA process, the parties confronted issues around data ownership, use, risk, privacy, form (raw vs. analyzed results), methods, costs, and long-term sustainability. In addition to these barriers, some cities' governments preferred to utilize 'official' sources that were already connected with the city planning and reporting systems.
Data management was another challenge for many of the pilot cities and regions involved. Effective management requires storing and processing queries on large amounts of data across several datasets in different forms and scales. Setting standard protocols and methods for storing different forms of data was viewed as an important resource investment. When cities were unable to manage and maintain the data dashboards on their own, some, like San Jose, partnered with a local university that could hold the 'data commons' 62 . This ensured confidentiality, helped to avoid partisan issues, and would enable the regular update of the data without it being affected by political change in municipal offices. For the overall data system management, Baltimore and NYC established a Chief Data Officer position to liaise with a designated lead data coordinator within each key department, thereby ensuring data could be updated and shared regularly, was error-free, and followed rigorous data collection methodologies.
To respond to data availability challenges, cities like Baltimore have established an Open Data Policy, which aims to make public data available as open source. NYC's 'Open Data Law' mandates that all public data be made freely available on a single web-based portal. Over the years, the law has been amended to include stronger data dictionaries with unified definitions of different terms, retention guidelines, spatial standards, and response timelines for public requests, to ensure the city's transparency around open data. Implementing these policies required strong leadership at the city level to ensure various entities across multiple sectors and departments worked together in a sustainable way. These findings are summarized in Fig. 2.
Local statistical capacities can be maximized by partnerships
Collecting and managing local-level data has cost and technical capacity implications that not all cities, especially those in the Global South, are able to bear. Many cities traditionally are not responsible for data collection, and often have limited statistical capacities. However, the pilots highlight a few of the strategies cities can take to avoid or share the cost burdens of developing robust local data systems. There may be some recurring costs associated with collecting locally relevant data that may be budgeted for under sectoral departments. Costs may also be associated with setting up a data storage system, although these may only be a one-time cost. There may be a fixed cost associated with a full-time statistician or data scientist, who can collate metadata and analyze the patterns on a regular basis. Some platform development expertise may be required for a short period and can be outsourced for a one-time cost, along with some investment in training key staff to use it. It was noted that data management and updates required minimal time for a staff member, whereas outreach, training, and enhancement of the platform could take up a quarter of their time. To begin with, these could be assigned to a team of existing staff members. Crowdsourced data collection platforms and sensor-based technologies were explored to reduce recurring data collection costs. Yet all cities, including big cities like NYC, faced budget constraints on robust data monitoring. More detailed methodologies are required to assess the costs associated with data management systems at the subnational level, especially owing to the large diversity in their sizes and contexts.
An alternative to building in-house capacity, which almost all (8 out of 9) pilots found useful, was establishing partnerships with academic institutions, which offered technical skills, helped with data collection and analysis, conducted regular staff training, and acted as long-term knowledge partners. Since many of these institutions also found value in helping the local governments to advance their own research interests, the partnerships turned out to be mutually beneficial.
Partnerships with universities and other local institutions also helped institutionalize roles and garner funds to support the processes. In Bristol, partnerships were established with the University of Bristol, the City Council, and two networks: the Bristol Green Capital Partnership of 800 organizations and the established Bristol SDG Alliance of 70+ stakeholders. A full-time 12-month position (an SDG Research and Engagement Associate) was created and funded by the university's Strategic Research Fund. Additional funding to extend the post for six months to produce a Voluntary Local Review (VLR) was subsequently secured from the University's Economic and Social Research Council Impact Acceleration Account. As the Associate was seconded from the City Council, they acted as an embedded advocate and coordinator for the SDGs. However, the contractual agreements between the institutions may also carry a small financial implication. These findings are summarized in Fig. 2.
Long-term sustainability of the processes is important to ensure the achievement of the goals
The pilot studies took place over 6 to 12 months and therefore only provided a snapshot of the local processes underway. Yet, throughout these pilots, other outcomes were observed that may help ensure the sustainability of these localization processes in the longer term and, in turn, help cities achieve sustainable development outcomes for themselves and the world at large.
Throughout the pilots, several public convenings were held to involve communities, local organizations, experts, and officials.
Their intention was to discuss and identify indicators that reflected community concerns, get community help in stocktaking exercises, and share data and information on the state of the region's sustainable development. This helped sow the seeds of a community of practice to coordinate numerous local sustainable development initiatives that could promote SDG achievement over the long term. Consultation fatigue was reduced by aligning with existing initiatives (e.g. Baltimore's 'Listening to the Listening approach'). Recognizing the limitations that poorer households and under-resourced organizations may face in freely offering their time, efforts were also made to reduce barriers to participation.
Project teams leveraged existing networks of civil society, private sector, and citizen groups. Baltimore 2030, which had emerged as a network of different institutions in response to the civil unrest in the city in 2015, shared the common aim of creating a long-term vision, sharing knowledge, and developing action steps to achieve the community's needs and aspirations. Meanwhile, new institutions also emerged through the pilots. In Brazil, the Metropolitan SDG Observatories (METRODS) were formed. METRODS was a network of institutions and organizations from civil society, the public and private sectors, and academia, with the overall objective to identify, disseminate, and monitor indicators related to SDG 11 in the major metropolitan regions of Brazil, as well as to raise funds and build local legitimacy in the data gathered and reported.
Non-government knowledge partners, with a history of working with city authorities and residents on urban policy issues, supplied much-needed technical capacity to kick off and maintain the SDG initiatives. Their engagement helped to build political interest, engage student communities, provide legitimacy, and build long-term institutional memory. For example, Stanford University's Sustainable Urban Systems Initiative in the Department of Civil & Environmental Engineering partnered with the City of San Jose to help localize the SDGs through initiatives like the Environmental Sustainability Plan. Student teams supported this work through project-based learning courses, in partnership with the Office of the Mayor and Environmental Services Department.
Processes were also institutionalized within the municipal systems to ensure longevity of the effort. The existing Office of Sustainability in Baltimore, and the Mayor's Office and the Environmental Services Department (which houses the Manager for Sustainability and Compliance) in San Jose, were identified as the leads for SDG monitoring. Integrating the SDGs with existing environmental portfolios helped ensure alignment with existing city plans and initiatives.
Overall, in all of the pilot cities, general awareness amongst local government officials on the SDGs and their relevance subnationally improved. In Brazil, through a Knowledge Questionnaire survey conducted with city officials of 22 municipalities, it was found that while 59  These findings are summarized in Fig. 2.

DISCUSSION
At the outset of this research, we hypothesized that a data-based approach to governance at the local level could help governments to achieve local sustainable development objectives and in doing so support their progress towards the global SDGs. The pilots demonstrated a genuine enthusiasm for the pursuit of sustainable development at the local level and a parallel interest in data and technology to support local governance. Capitalizing on local actors' interest in data provided a useful hook to initiate conversation on sustainable development goals and priorities.
Data also proved a useful angle through which to engage partners in the sustainable development discussion.
With regard to the actual use of data across departments and amongst different stakeholders, a number of common challenges were identified which hindered governments' ability to compile and share standardized sustainable development data. The pilots highlighted a range of enterprising strategies to overcome these challenges; for instance, a lack of technical capacity was often resolved through partnerships, data sharing between different departments was facilitated by open data laws, compiling data across multiple overlapping jurisdictions was eased by the use of spatial data, and the latest technologies (such as sensor data) enabled the collection of data for unrepresented issues. But some challenges persist, even in developed economies, pointing to issues that lie beyond financial or technological constraints.
All of the pilots were required to identify indicators that were locally relevant and aligned with the SDGs. In doing so, all of them were able to bring greater specificity to monitoring systems by disaggregating data requirements, making them more inclusive and representative of the most vulnerable groups. This also helped make existing local plans and programs more target-oriented. While most pilots (7 out of 9) managed to use data effectively to document baselines, collecting annual data for all indicators was a common challenge, even for wealthy cities like New York and Los Angeles. On the other hand, in cases where innovative data monitoring systems (e.g. urban observatories) were set up as part of this process, much baseline data was missing, calling into question their immediate utility for policy-makers.
The process of localizing the SDGs and identifying contextually relevant indicators was found to have many benefits, but it also raised challenges for reporting to the national level. While all pilots effectively adapted the global SDG targets and indicators to local-level priorities, the hyper-local nature of these datasets was inadequate for aggregating at the national level, or for comparisons across different urban areas. Localization may result in more isolation as there is an inherent lack of adaptability and comparability, even though it may offer more appeal to the local community. The pilots have also cautioned against attempting to develop standard city-scale indicators that all cities would pursue; rather, they found that the power lies in applying the framework holistically and prioritizing issues through a consensus-building exercise. A 'healthy tension between comparability across cities, which helps spur innovation and share lessons, and customization to their local realities, which enables internal coordination and performance management' is supported 5 .
Some cities explored the use of technologies and innovative data sources, but such data collection was difficult and resource-intensive. The experiences underscore the unequal access to relevant technology between poor and wealthy countries and localities, as observed by the IEAG 24 . There was also caution raised by cities such as Baltimore that there could be inequity of technology access between neighborhoods at a sub-city level, and relying only on such means for data gathering could in itself exacerbate uneven representation.
Open data offers benefits such as increased transparency and innovation, but some regions continue to face obstacles to fully realizing this potential. Many innovative open data platforms and digital systems are now enabling the collection and sharing of different forms of raw data and analyzed information 63 . Laws at the city level are emerging, although primarily in the Global North, to make government data more open to the public, leading to greater transparency 64 . While open access data can enable governments, corporations, citizen groups, and researchers to monitor, experiment, question, and influence decision-making 24 to steer development outcomes in the right direction, barriers to its adoption across different ownerships, sectors, and technological capabilities persist 65 . Independent, not-for-profit institutions or crowd-sourced spatial data sources (such as OpenStreetMap) do, however, show some promise, both for overcoming governance concerns with regard to open data and for encouraging citizens' use of the information 66 .
With regard to data openness, while the longer-duration pilots showed more promise in making their data open using online platforms, printed reports, and consultative exercises, the shorter pilots struggled with setting up such systems, irrespective of being from high- or low-income contexts.
Barriers to the adoption of open data, such as a lack of trust and reliability, persisted 65 . The interoperability of data was also a major challenge across the board. Cities faced challenges in compiling data that was privately held, kept in forms that were not machine-readable, or produced according to definitions and methodologies that were not sufficiently robust.
Despite the relatively small funding and technical support offered through the pilots, any support for improving data-management systems was welcomed across the cities, and overall increased the buy-in to incorporate the SDGs with the local development plans. In a way, this approach worked like a 'Trojan horse' for SDG localization efforts. This supports our hypothesis that strengthening data systems and management can encourage political attention to and greater awareness of sustainable development.
While awareness of the SDGs appears to have generally improved at the local level, the experiences indicate that more will need to be done to improve the data readiness of local governments in smaller cities. Although in many cases, such as with Indian cities, the devolution of powers has given urban local bodies a larger role in planning and implementation, small and medium-sized cities continue to face a lack of fiscal autonomy and depend on the national and provincial governments to fulfill their functioning needs, including data collection and monitoring.

METHODS
A review of pilot projects in nine urban areas
This paper is based on a review of action-research pilots conducted by the United Nations Sustainable Development Solutions Network (SDSN)'s Thematic Research Network on Data and Statistics (TReNDS) and the Cities Thematic Network. The review included project reports, briefs, and interviews with select project team members. The pilots were conducted over two stages.
The first stage, the United States Sustainable Cities Initiative (US-SCI), offered financial and technical support for 12-month periods across 2014-17 to three cities in the United States: Baltimore, New York, and San Jose. These cities were selected as grantees to represent three different size classes across the US, and because of their interest in participating.
The second stage, the Local Data Action Sustainable Initiative (LDA-SI), provided six months of financial and technical support through 2018-19 based on an open call for applications to cities globally. The six subnational regions (the City of Patiala in India; the metropolitan regions of Belo Horizonte, Rio de Janeiro, and São Paulo in Brazil; Los Angeles in the U.S.; a Colombian City Network representing 16 cities; Bristol in the U.K.; and Aruba, representing a Small Island Developing State) were selected based on the rigor of their proposals and to ensure that there was diversity across regional, geographic, and development contexts.
While the pilots are not directly comparable with one another or to other global subnational regions due to their specific socio-political, environmental, and economic complexities, collectively, they are meant to offer insights on both emerging innovations and the remaining challenges at the subnational level for the data revolution. Our study focused on these two pilot projects as the key source of information, since they closely G. Jain and J. Espey