Abstract
The purpose of the GLocal dataset is to enable research in international development that requires both global scope and local precision. Leveraging modern geospatial analysis tools, we process a diverse array of sources to provide researchers with a growing set of economic, demographic, ecological and socio-political variables for geographic units relevant to public policy. We provide separate data files for different levels of administrative and periodic aggregation, along with ad-hoc files with more detailed information on specific topics. In this data descriptor paper, we discuss both our data processing methodologies and validation pipelines, and provide a short case study to illustrate the research potential of the dataset. We also introduce a simple web app, glocal.streamlit.app, which offers a user-friendly interface for exploring and visualizing the dataset. Given the growing number of public and granular sources of relevance for international development research, we hope to continue adding features and expand the GLocal dataset in the future.
Background & Summary
Research on international development often requires subnational comparisons across different countries. One such example is the development of comparative studies of regions that are unique within their own countries, but that share noticeable similarities with other regions outside their national borders. For example, in order to analyze patterns of development and extraction in the Colombian Amazon, the most adequate benchmarks may not be found in other Colombian regions, but in Amazon regions in other countries1. More broadly, research questions of global importance often depend on subnational variation to abstract from potential confounders. For instance, studies focusing on the developmental impacts of regional and ethnic favoritism often leverage local variation in relative affiliation to national leaders2,3. Finally, development researchers and policy analysts often find themselves lacking fundamental quantitative inputs in settings with underdeveloped statistical sources4. These problems compound as national statistical agencies do not coordinate the timing and methodologies with which they measure development outcomes of interest, forcing researchers to develop ad-hoc solutions that often rely on specialized geospatial analysis skills outside of the toolbox of most development researchers and analysts.
To address these constraints, we have developed a structured dataset that provides subnational economic, demographic, ecological and socio-political indicators for all countries in the world. This “GLocal” dataset leverages a diverse and growing array of granular sources, processing them with modern geospatial analysis tools to create aggregates at consistent geographic and temporal levels. Importantly, data are aggregated at different levels of subnational administrative boundaries, which are of special interest for the purpose of policy-making. The resulting datasets are immediately available for download and analysis. More specifically, we provide 14 files corresponding to the aggregations: 9 files, one for each combination of three administrative and three periodicity levels, 3 files with all variables annualized for convenience at each administrative level (monthly data aggregated to yearly, cross-section data assigned to the year of observation), and 2 ad-hoc files with detailed information on agro-ecological crop suitability and mineral deposits. We also provide supporting data containing a codebook and information related to the administrative areas, for convenience. Overall, the GLocal datasets provide internationally comparable and locally precise data on nighttime lights (VIIRS, DMSP and Harmonized)5,6,7,8,9, land coverage types10,11, population and population density12, protests and violent events13, temperature and rainfall14,15,16, terrain topology17,18, trade infrastructure19,20, deforestation21, agricultural output and suitability22, relationship to capital city and water bodies23,24, mineral deposits and gas flares25,26, road network density27, reach of telecommunications services28,29, clean energy potential30, carbon emissions and air quality31,32.
While our hope is to further expand the set of comparable and precise features available through the GLocal dataset, we believe the current set should enable researchers working on some of the most consequential topics in economic development: Economic growth, urbanization, deforestation, infrastructure, conflict and climate change. Importantly, the final list of variables in the dataset was determined considering licensing restrictions and the international comparability of original data sources. For instance, licensing constraints prevented us from creating GLocal aggregations from the Armed Conflict Location and Event Data Project (ACLED)33. Moreover, comparability issues led us not to incorporate data aggregations of local statistical agencies, even if broadly available through systems like IPUMS International34.
The rest of this data descriptor continues as follows: First, we provide an exhaustive list of the inputs that we use for the creation of the dataset, and outline the specific geospatial aggregation routines performed for each type of dataset. Second, we provide the data records that can allow readers to download and use the data. Third, we outline the technical validation exercises that we performed to confirm the quality of the datasets for each specific source. Fourth, we provide usage notes for development researchers to make use of the data, and introduce short case studies to highlight the potential use of this source. Finally, we outline how we make our processing and validation code available to all users.
Methods
Administrative aggregation
In order to develop the GLocal dataset, we first needed to select a standard for the spatial reach of geographic units throughout the world. While similar efforts have focused on apolitical territorial “grids” that are exogenous to economic, political and cultural features of interest35,36, we focus on established administrative areas. We do so because these are of special interest for policy-making purposes, as they align with the separation of sovereignty and policy-making authority, and constitute a shared reference for citizens and governments on the classification of different territories. We used version 3.6 of the Database of Global Administrative Areas (GADM)37. We took GADM polygon shapefiles for administrative units at levels 0 (e.g. Republics, Kingdoms), 1 (e.g. States, Provinces), and 2 (e.g. Municipalities, Districts), and calculated their respective geometric centroids. Moreover, we used population data from 202038 to also calculate the respective population-weighted centroids.
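The two kinds of centroid can be illustrated with a minimal Python sketch. It is not the pipeline itself: it assumes planar coordinates, a single polygon ring without holes, and a hypothetical list of population grid cells that already fall inside the unit. The function names are made up for illustration.

```python
def geometric_centroid(ring):
    """Planar centroid of a simple polygon ring [(x, y), ...] via the shoelace formula."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # close the ring if needed
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def population_weighted_centroid(cells):
    """cells: iterable of (x, y, population) for grid cells inside the unit (hypothetical input)."""
    cells = list(cells)
    total = sum(pop for _, _, pop in cells)
    if total == 0:
        raise ValueError("unit has no population; weighted centroid is undefined")
    x = sum(cx * pop for cx, _, pop in cells) / total
    y = sum(cy * pop for _, cy, pop in cells) / total
    return x, y
```

The geometric centroid depends only on the shape of the unit, while the population-weighted centroid shifts toward wherever residents are concentrated, which matters for distance and travel-time variables in units with large uninhabited areas.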
With polygon borders and point centroids characterizing each administrative area, we have all the inputs necessary to link all locations to the different data sources. Table 1 provides a taxonomy to organize the information available in the GLocal dataset into distinct categories. The different categories are classified according to the original data type (A: Points, B: Lines/Polygons, or C: Rasters) and their topic (I: Ecological, II: Political/Demographic, or III: Economic). The accompanying data codebook provides specific details on all the variables created, including units, available time period, periodicity, licensing and terms of use, source URLs and citations. Importantly, the codebook is organized according to the categories shown in Table 1. For example, variables on Nighttime lights are the first subtopic of variables of Type C and Topic III. Hence, these variables can be found in the codebook under category C.III.1.
Point sources
Point sources identify the geolocation of different events or elements of interest. For these sources, we were interested in counting the total number of points that intersect with each administrative polygon, creating a binary marker for the presence of such intersections, or calculating the distances or travel times between each administrative centroid and its nearest point. All these calculations were performed in R with standard functions from the sf package39 that intersect points with polygons or calculate geodesic distances between points using the same geographic projection. The notable exception was the calculation of driving times to trade infrastructure and markets. In this instance, we used a least-cost path algorithm on Google Earth Engine following the methodology and data outlined in Weiss et al.40 to assess driving times between the pixels in each administrative area and different points of interest (e.g. Cities, Large Cities, Ports, Airports), and then obtained the median travel time to various resources from each administrative area.
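The three point-based summaries (count, binary marker, distance to the nearest point) can be sketched as follows. This is an illustrative Python stand-in for the sf routines, under stated assumptions: the polygon is a single ring of (longitude, latitude) vertices treated as planar for the containment test, and distances use the spherical haversine formula rather than whatever geodesic method sf applies. All names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_in_ring(lon, lat, ring):
    """Ray-casting containment test against a single ring of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > lat) != (y1 > lat):  # edge straddles the horizontal ray
            x_cross = x0 + (lat - y0) * (x1 - x0) / (y1 - y0)
            if lon < x_cross:
                inside = not inside
    return inside


def summarize_points(ring, centroid, points):
    """Count of points in the unit, presence marker, and distance from centroid to nearest point."""
    count = sum(point_in_ring(lon, lat, ring) for lon, lat in points)
    nearest = min(
        (haversine_km(centroid[0], centroid[1], lon, lat) for lon, lat in points),
        default=None,  # no points at all in the source
    )
    return {"count": count, "any": int(count > 0), "nearest_km": nearest}
```

Note that the nearest point is searched over the whole source, not only over points inside the unit, so a unit with a count of zero still receives a distance.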
Line/Polygon sources
Similar to point-coded information, we were interested in the intersection of administrative polygons and the distance between administrative centroids with the lines and polygons characterizing different sources. The relevant geodesic distances were calculated using standard functions from the sf package39 in R. A key exception was the calculation of the road density in an administrative area. In this case, we took worldwide road network data from OpenStreetMap via the GRIP global roads database41, and calculated the total length of roads that fall entirely within each administrative unit. For the purpose of computational efficiency at a global scale, we used the Mollweide equal-area projection in calculating the length of road networks within each administrative polygon.
Raster sources
Raster-coded sources incorporated in the GLocal dataset are either categorical or continuous. Categorical rasters classify each pixel into types. In these cases, we calculated the total area in each administrative unit that falls in each of the relevant categories. Continuous rasters provide numeric information about the magnitude of a certain phenomenon in each pixel. We aggregated continuous rasters into total sums or area-weighted means or medians for each administrative unit. The dataset’s codebook identifies the specific aggregation function performed for each of the raster aggregation variables. Rasters were ingested using the terra and raster packages in R42,43. All raster aggregations were performed with zonal statistics functions in R using the exactextractr package44. The values of border pixels were split according to the share of the pixel’s area that falls in each administrative polygon. That is, our methodology currently assumes that the data for a pixel is uniformly distributed across the pixel, which is a simplification. Notable exceptions are the Custom VIIRS Nighttime Lights and the land cover variables. The custom cleaning and aggregation routines pursued in these cases were performed on Google Earth Engine, which fully assigns each border pixel’s value to the administrative polygon that contains the pixel’s centroid. At coarser resolutions, since vector boundaries often intersect with pixels, these simplifications can introduce some inaccuracies. We have used the most detailed resolution available to reduce these inaccuracies, but this remains a limitation.
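The difference between the two border-pixel rules can be seen in a small Python sketch. It is an illustration only, not the exactextractr or Earth Engine code: each pixel is a hypothetical tuple of (value, share of the pixel's area inside the polygon, whether the pixel centroid is inside the polygon), and pixels are assumed to have equal area so that coverage shares can serve as weights.

```python
def fractional_sum(pixels):
    """Sum with border pixels split by the share of their area inside the polygon."""
    return sum(value * share for value, share, _ in pixels if value is not None)


def fractional_mean(pixels):
    """Coverage-weighted mean; assumes equal-area pixels so the share acts as the weight."""
    valid = [(v, s) for v, s, _ in pixels if v is not None and s > 0]
    weight = sum(s for _, s in valid)
    if weight == 0:
        return None  # polygon covers no valid pixel
    return sum(v * s for v, s in valid) / weight


def centroid_sum(pixels):
    """Sum where a pixel counts in full only if its centroid lies inside the polygon."""
    return sum(v for v, _, centroid_inside in pixels if v is not None and centroid_inside)


# One interior pixel and two border pixels (made-up values).
example = [(10.0, 1.0, True), (20.0, 0.5, True), (8.0, 0.25, False)]
print(fractional_sum(example), centroid_sum(example))
```

In this toy case the fractional rule yields 22.0 and the centroid rule yields 30.0, which shows why the two approaches diverge most for small units relative to pixel size.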
Data Records
The GLocal datasets are available in the Harvard Dataverse45: https://doi.org/10.7910/DVN/6TUCTE. The repository includes 14 data files containing GLocal aggregations, and some supporting data. Of the 14, nine files correspond to each combination of three levels of administrative detail (GID0, GID1 and GID2) and three periodicity levels (annual, monthly, and cross-section). The annual files contain sources with yearly or multi-year periodicity, and the monthly files contain sources that provide monthly variation. Cross-section refers to sources that are either fixed characteristics (such as elevation) or were measured just once. For data measured only once, the year of measurement is specified in the data codebook. Three further files, one per administrative level, provide all variables annualized for convenience (monthly data aggregated to yearly, cross-section data assigned to the year of observation). The repository has two additional cross-section datasets with information about the agroecological suitability of different crops, and the presence of deposits of different minerals. We provide these files in both CSV and Parquet data formats. Additionally, the repository includes summary and validation PDF files that outline the characteristics of each variable in the dataset. Finally, users will find a thorough codebook of variable characteristics.
We have included selected summary statistics (min, max, median, mean) by country of each variable at the GADM levels 1 and 2 as part of the output from the data validation process in order to allow the user to get a high-level understanding of the data. These summary statistics along with the code required to produce them are available in the GitHub repository.
Technical Validation
Data validation was conducted in three stages for each of the three levels of geographic specificity: integrity, completeness and aggregation consistency.
The integrity stage verified that each region had exactly one row for each year, even if no data was present in the row. In short, it assessed whether the data set was “rectangular.” In the most recent iteration of the data, all regions in all temporal and spatial aggregations are present for all periods, indicating perfect integrity.
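A rectangularity check of this kind can be sketched in a few lines of Python. This is a hedged illustration rather than the validation code in the repository: rows are represented only by their hypothetical (region identifier, period) keys, and the function name is made up.

```python
from collections import Counter
from itertools import product


def check_rectangular(row_keys, regions, periods):
    """Compare observed (region, period) keys against the full region x period grid."""
    counts = Counter(row_keys)
    expected = set(product(regions, periods))
    return {
        "missing": sorted(expected - set(counts)),               # combinations with no row
        "duplicated": sorted(k for k, c in counts.items() if c > 1),  # more than one row
        "unexpected": sorted(set(counts) - expected),            # rows outside the grid
    }
```

A data set passes when all three lists are empty, that is, when every region has exactly one row per period.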
The completeness stage comprised two tests. The first assessed the prevalence of NAs for each variable in a given region within the variable’s time range. For instance, the “Rain - GPCP” variable has data for every year in the 1979 - 2021 range, and therefore should have zero NAs in this range.
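The first completeness test amounts to computing, for each region, the share of missing values inside the variable's documented time range. A minimal Python sketch, assuming hypothetical long-format records of (region, year, value) and treating both None and NaN as missing:

```python
import math


def is_missing(value):
    """Treat None and floating-point NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def na_share_in_range(records, first_year, last_year):
    """Share of missing values per region, counting only rows inside [first_year, last_year]."""
    totals, missing = {}, {}
    for region, year, value in records:
        if first_year <= year <= last_year:
            totals[region] = totals.get(region, 0) + 1
            missing[region] = missing.get(region, 0) + int(is_missing(value))
    return {region: missing[region] / totals[region] for region in totals}
```

For a variable such as the rainfall example above, every region would be expected to return a share of zero when the function is called with the 1979 to 2021 range.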
The second test implemented simple internal validity measures to ensure the consistency of the data, providing summary statistics like minima, maxima, means and medians for each variable in each data set. Every observation was compared against indicators for anomalies, such as a nightlight count below zero, which would be impossible, or elevations above or below geographic extremes, to ensure raster processing errors were rectified. The results of both tests for each variable are available in PDF form in the validation section of the code repository.
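The second test can be illustrated with a short Python sketch that produces the four summary statistics and flags values outside plausible bounds. The bounds are passed in by the caller and are assumptions of this example, not the thresholds used in the actual validation.

```python
import statistics


def summarize(values):
    """Min, max, median and mean of the non-missing values."""
    clean = [v for v in values if v is not None]
    return {
        "min": min(clean),
        "max": max(clean),
        "median": statistics.median(clean),
        "mean": statistics.fmean(clean),
    }


def flag_out_of_bounds(values, lower=None, upper=None):
    """Indices of non-missing values below `lower` or above `upper`."""
    flagged = []
    for i, v in enumerate(values):
        if v is None:
            continue
        if (lower is not None and v < lower) or (upper is not None and v > upper):
            flagged.append(i)
    return flagged
```

For a nighttime lights variable one would call `flag_out_of_bounds(values, lower=0)`, since negative radiance sums are impossible; for elevation, both a lower and an upper bound apply.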
The final stage was an aggregation consistency check, which verified that, for relevant statistics, the higher-level geographic areas were reflective of the sum or geographically weighted average of their constituent parts. For instance, GID0 areas should have a nightlight count value equal to the sum of all of their GID1 subdivisions, and each GID1 subdivision should be the sum of its GID2 subdivisions.
Results indicated that for relevant variables, aggregation was within 1 percent of expected values, with over half being within 0.01 percent. GDELT values are an exception: they do not aggregate uniformly across levels because the geolocation assigned to some events in the original data is set to be representative of broader levels of geographic precision. Therefore, they were excluded from aggregation consistency checks. A table summary of the consistency of aggregation for each variable is available in the validation section of the code repository.
We also conducted an external validity check against nighttime lights data aggregated by GeoQuery46. GeoQuery is a research initiative hosted by the AidData lab at William & Mary, which also facilitates the aggregation and analysis of certain geospatial features for academic research. GeoQuery uses different administrative boundaries from GLocal, except for the USA, where we are able to compare the two at the state level (GID-1). Figure 1 contains a correlation plot of GeoQuery aggregated vs. GLocal aggregated values for nighttime lights for the year 2020. The values involving the GeoQuery aggregated values are for the USA only, whereas the other values are for all countries. Most of the variables are highly correlated, except for dmsp_ext, which contains an extension of the DMSP data. This is expected, as the DMSP data are known to have top-coding issues in urban areas, and are also much less precise than VIIRS.
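For summed variables, the consistency check reduces to comparing each parent value with the total of its children and expressing the gap relative to the parent. The following Python sketch illustrates the idea; the identifiers, the dictionary layout and the function name are hypothetical, and weighted-average variables would need weights that are not shown here.

```python
def aggregation_gaps(parent_values, child_values, parent_of):
    """Relative gap between each parent's value and the sum of its children.

    parent_values: {parent_id: value}
    child_values:  {child_id: value}
    parent_of:     {child_id: parent_id}
    """
    sums = {}
    for child, value in child_values.items():
        parent = parent_of[child]
        sums[parent] = sums.get(parent, 0.0) + value
    gaps = {}
    for parent, expected in parent_values.items():
        total = sums.get(parent, 0.0)
        if expected == 0:
            gaps[parent] = 0.0 if total == 0 else float("inf")
        else:
            gaps[parent] = abs(total - expected) / abs(expected)
    return gaps


# Made-up identifiers and values: one parent is off by 0.5 percent, the other matches.
gaps = aggregation_gaps(
    {"COL": 100.0, "PER": 50.0},
    {"COL.1": 60.0, "COL.2": 39.5, "PER.1": 50.0},
    {"COL.1": "COL", "COL.2": "COL", "PER.1": "PER"},
)
within_tolerance = all(gap <= 0.01 for gap in gaps.values())
```

Applying a 1 percent tolerance to the resulting gaps mirrors the threshold reported above.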
Usage Notes
While the GLocal datasets provide a clean collection of information to enable development research and policy analyses that rely on international scope and subnational precision, we advise users to exercise judgement in the use of this source and in the interpretation of analyses based on it. A first consideration has to do with the nature of administrative boundaries. As discussed above, while we focus on administrative areas with political and policy meaning in their specific countries, similar efforts have focused on arbitrary “grids” that are not a function of political, economic and cultural features of interest. A separate concern is that local economic and political dynamics of interest may aggregate spatially in ways that transcend both grids and administrative boundaries. Indeed, the UN Statistical Commission has endorsed the “Degree of Urbanization” approach to identify cities, towns, semi-dense areas and rural areas in an internationally comparable manner, leveraging geocomputation methods to outline contiguous areas meeting population density criteria47,48. Finally, users should be mindful that the degrees of political authority in administrative areas coded as being in the same “level” under the GADM classification can be drastically different across national boundaries. While all these considerations invite users to exercise judgement in deciding which levels of analysis to focus on, we expect to add grid-based and urbanization-based spatial aggregations to the GLocal Datasets in the near future.
Another relevant consideration for users to exercise judgement on is whether the information provided in the GLocal dataset is the right measure for a given phenomenon of interest. For example, we provide information on temperatures and precipitation from different sources. While this information is important in itself, Schon & Koren36 highlight the importance of focusing on “anomalies” to adequately approximate weather shocks. Given that estimating such anomalies (and many other transformations of interest) at the administrative level does not require additional geospatial computations, we opted to only provide zonal aggregations in the GLocal dataset and allow users to produce the transformations pertinent to their analyses.
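As an illustration of such a user-side transformation, the Python sketch below computes standardized anomalies for one administrative unit: each year's value minus the unit's baseline mean, divided by the baseline standard deviation. This is one common definition and is an assumption of the example; it is not necessarily the definition used in the cited work, and the baseline window is a choice left to the analyst.

```python
import statistics


def standardized_anomalies(series, baseline_years):
    """series: {year: value} for one unit. Returns {year: z-score relative to the baseline}."""
    baseline = [series[year] for year in baseline_years if year in series]
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    mean = statistics.fmean(baseline)
    sd = statistics.stdev(baseline)  # sample standard deviation
    if sd == 0:
        raise ValueError("baseline has no variation")
    return {year: (value - mean) / sd for year, value in series.items()}
```

Because the calculation only uses the unit-level series already present in the annual or monthly files, it requires no further geospatial processing.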
Similarly, users must be mindful of the type and quality of the variation that is captured by each of the sources in different periods. For example, information on nighttime lights has been used to approximate local levels of economic development. However, earlier measures of local nighttime lights based on the Defense Meteorological Satellite Program (DMSP) had problems identifying low emissions in rural areas and also suffered from top-coding problems in relatively developed areas. While data from DMSP was originally released from 1992 to 2013, the program has continued collecting measurements. More precise measures of nighttime lights now come from the Visible Infrared Imaging Radiometer Suite (VIIRS) instrument aboard the joint NASA/NOAA Suomi National Polar-orbiting Partnership (Suomi NPP). There have been efforts to match the DMSP and the VIIRS data to build longer time series at the DMSP unit scale. In the GLocal Dataset, we provide the old DMSP data, the new DMSP data, the VIIRS data and the VIIRS data transformed into the DMSP scale as separate features, allowing users to consider the specific advantages and drawbacks of the different sources thoughtfully before performing their analyses.
Lastly, users should be mindful that sources that do not originate in satellite observations may suffer from measurement errors that correlate with baseline levels of economic development. For example, it is possible that road density is underestimated in developing areas if existing roads in such environments are less likely to be included in OpenStreetMap. Moreover, travel time estimates are calculated assuming constant movement at the road speed limit, which does not consider how average congestion or environmental factors may influence the impedance of traversing different segments of the road network.
Below, we show how cross-sectional information from the GLocal dataset can be used to assess the relationship between the level of development of an administrative area as captured by its measure of nighttime lights radiance (VIIRS) per 1,000,000 people in 2019 and the average proximity (inverse driving time) between its cities and their respective closest city of more than 1,000,000 inhabitants. Panel A of Fig. 2 shows a positive relationship between nighttime lights per capita and proximity to large cities for administrative units at the GID 1 level. Importantly, Panel B shows that areas close to large cities are relatively dense. While there might be a direct connection between proximity to large cities and living standards, it may be mediated by administrative areas becoming relatively dense. We explore this question by assessing how the linear association between proximity to cities and nighttime lights per person changes after conditioning on local population density. Columns 1 and 2 of Table 2 evaluate this linear association for GID 1 administrative areas after including country (GID 0 level) fixed effects, showing that the positive relationship between proximity to large cities and nighttime lights is reversed after conditioning on population density. Columns 3 and 4 (5 and 6) evaluate the linear association for GID 2 administrative areas controlling for GID 0 (GID 1) fixed effects, and show that the positive connection between proximity to large cities and nighttime lights per person attenuates by more than 85% after conditioning on population density.
To further aid in the comprehension and practical application of the dataset, we have developed a website, glocal.streamlit.app, which offers a user-friendly interface for exploring and visualizing the dataset. Additionally, we have prepared a Jupyter notebook that walks users through an additional example application of the dataset. The code used for these case studies has been stored in this GitHub repository.
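The logic of comparing the association with and without a control, while absorbing fixed effects, can be sketched in plain Python. This is not the code behind Table 2: it is a minimal within-estimator that demeans every variable by group (for example, by country) and then solves the normal equations, with hypothetical function and argument names and no standard errors or variable transformations.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean, which absorbs group fixed effects."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def within_ols(y, regressors, groups):
    """Slope coefficients of y on the regressor columns after absorbing group fixed effects."""
    y_d = demean_by_group(y, groups)
    x_d = [demean_by_group(col, groups) for col in regressors]
    k = len(x_d)
    xtx = [[sum(p * q for p, q in zip(x_d[i], x_d[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(x_d[i], y_d)) for i in range(k)]
    return solve(xtx, xty)
```

Calling `within_ols(lights, [proximity], country)` and then `within_ols(lights, [proximity, density], country)` on user-constructed columns reproduces the structure of the comparison: the change in the first coefficient between the two calls indicates how much of the association is accounted for by density.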
Code availability
All the relevant code for replicating the different variables in the GLocal datasets is publicly available to users at this GitHub repository. We have added README files that explain how to replicate each component of the study. Note that acquiring each underlying input dataset for aggregation involves different processes. Wherever possible, download scripts have been included to simplify the process of downloading each dataset. However, in some cases, the data is not publicly available, and we have provided instructions on how to request access to the data (mentioned in the codebook).
References
Rueda-Sanz, A. & Cheston, T. The economic tale of two amazons: Lessons in generating shared prosperity while protecting the forest in the peruvian and colombian amazon. Tech. Rep., Center for International Development at Harvard University (2023).
Hodler, R. & Raschky, P. A. Regional favoritism. The Quarterly Journal of Economics 129, 995–1033 (2014).
De Luca, G., Hodler, R., Raschky, P. A. & Valsecchi, M. Ethnic favoritism: An axiom of politics? Journal of Development Economics 132, 115–129 (2018).
Wang, X. et al. Estimation and mapping of sub-national gdp in uganda using npp-viirs imagery. Remote Sensing 11, 163 (2019).
Nechaev, D. et al. Cross-sensor nighttime lights image calibration for dmsp/ols and snpp/viirs with residual u-net. Remote Sensing 13, 5026 (2021).
Ghosh, T. et al. Extending the dmsp nighttime lights time series beyond 2013. Remote Sensing 13, 5004 (2021).
Baugh, K., Elvidge, C. D., Ghosh, T. & Ziskin, D. Development of a 2009 stable lights product using dmsp-ols data. Proceedings of the Asia-Pacific Advanced Network 30, 114 (2010).
Elvidge, C. D., Baugh, K. E., Zhizhin, M. & Hsu, F.-C. Why viirs data are superior to dmsp for mapping nighttime lights. Proceedings of the Asia-Pacific Advanced Network 35, 62 (2013).
Elvidge, C. D., Zhizhin, M., Ghosh, T., Hsu, F.-C. & Taneja, J. Annual time series of global viirs nighttime lights derived from monthly averages: 2012 to 2019. Remote Sensing 13, 922 (2021).
Brown, C. F. et al. Dynamic world, near real-time global 10 m land use land cover mapping. Scientific Data 9, 251 (2022).
Friedl, M. & Sulla-Menashe, D. Modis/terra+ aqua land cover type yearly l3 global 500m sin grid v061. NASA EOSDIS Land Processes DAAC: Sioux Falls, SD, USA (2022).
Warszawski, L. et al. Center for international earth science information network-ciesin-columbia university.(2016). gridded population of the world, version 4 (gpwv4), Population density. palisades. ny: Nasa socioeconomic data and applications center (sedac). Atlas Environ. Risks Facing China Under Clim. Chang. 228, https://doi.org/10.7927/h4np22dq (2017).
Leetaru, K. & Schrodt, P. A. Gdelt: Global data on events, location, and tone, 1979–2012. In ISA annual convention, vol. 2, 1–49 (Citeseer, 2013).
Harris, I., Osborn, T. J., Jones, P. & Lister, D. Version 4 of the cru ts monthly high-resolution gridded multivariate climate dataset. Scientific data 7, 109 (2020).
Adler, R. F. et al. The version-2 global precipitation climatology project (gpcp) monthly precipitation analysis (1979–present). Journal of hydrometeorology 4, 1147–1167 (2003).
Schneider, U. et al. Gpcc full data reanalysis version 7.0 at 0.5°: Monthly land-surface precipitation from rain-gauges built on gts-based and historic data (global precipitation climatology centre, 2015). Atmosphere (Basel) 9 (2018).
Gesch, D. & Greenlee, S. Gtopo30 documentation. US Department of the Interior US Geological Survey (1996).
Nunn, N. & Puga, D. Ruggedness: The blessing of bad geography in africa. Review of Economics and Statistics 94, 20–36 (2012).
Megginson, D. Ourairports (2021).
Geonode global ports dataset (2021).
Hansen, M. C. et al. High-resolution global maps of 21st-century forest cover change. science 342, 850–853 (2013).
FAO, I. Global agro ecological zones version 4 (gaez v4) (2021).
Geonode global ports dataset (2013).
Kelso, N. V. & Patterson, T. Introducing natural earth data-naturalearthdata. com. Geographia Technica 5, 25 (2010).
Schweitzer, P. Record quality tables for the mineral resources data system. US Geological Survey data release (2019).
Elvidge, C. D. et al. A fifteen year record of global natural gas flaring derived from satellite data. Energies 2, 595–622 (2009).
Bennett, J. OpenStreetMap (Packt Publishing Ltd, 2010).
Opencellid (2023).
Bartholomew, C. Mobile coverage maps. Glasgow: Collins Bartholomew Ltd (2020).
Estima, J., Fichaux, N., Menard, L. & Ghedira, H. The global solar and wind atlas: a unique global spatial data infrastructure for all renewable energy. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on MapInteraction, 36–39 (2013).
Edgar (emissions database for global atmospheric research) community ghg database (2022).
van Donkelaar, A. et al. Documentation for the global annual pm2. 5 grids from modis, misr and seawifs aerosol optical depth (aod) with gwr, 1998-2016. Palisades NY: NASA Socioeconomic Data and Applications Center (2018).
Raleigh, C., Linke, R., Hegre, H. & Karlsen, J. Introducing acled: An armed conflict location and event dataset. Journal of peace research 47, 651–660 (2010).
Ruggles, S., King, M. L., Levison, D., McCaa, R. & Sobek, M. Ipums-international. Historical Methods: A Journal of Quantitative and Interdisciplinary History 36, 60–65 (2003).
Tollefsen, A. F., Strand, H. & Buhaug, H. Prio-grid: A unified spatial data structure. Journal of Peace Research 49, 363–374 (2012).
Schon, J. & Koren, O. Introducing afrogrid, a unified framework for environmental conflict research in africa. Scientific data 9, 116 (2022).
Hijmans, R., Garcia, N. & Wieczorek, J. Gadm: database of global administrative areas, version 3.6. GADM Maps and Data (2018).
Center for International Earth Science Information Network (CIESIN), Columbia University. Gridded population of the world, version 4 (gpwv4): Population density, revision 11. NASA Socioeconomic Data and Applications Center (SEDAC) (2018).
Pebesma, E. & Bivand, R. Spatial Data Science: With applications in R (Chapman and Hall/CRC, 2023).
Weiss, D. J. et al. A global map of travel time to cities to assess inequalities in accessibility in 2015. Nature 553, 333–336 (2018).
Meijer, J. R., Huijbregts, M. A. J., Schotten, K. C. G. J. & Schipper, A. M. Global patterns of current and future road infrastructure. Environmental Research Letters 13, 064006, https://doi.org/10.1088/1748-9326/aabd42 (2018).
Hijmans, R. J. terra: Spatial Data Analysis R package version 1.7-58. (2023).
Hijmans, R. J. et al. Raster: Geographic Data Analysis and Modeling (2023).
Baston, D. exactextractr: Fast Extraction from Raster Datasets using Polygons https://isciences.gitlab.io/exactextractr/, https://github.com/isciences/exactextractr (2023).
Morales-Arilla, J. & Gadgin Matha, S. Glocal: A global development dataset of local administrative areas. Harvard Dataverse https://doi.org/10.7910/DVN/6TUCTE (2023).
Goodman, S., BenYishay, A., Lv, Z. & Runfola, D. Geoquery: Integrating hpc systems and public web-based geospatial data tools. Computers & Geosciences 122, 103–112 (2019).
Commission, U. N. S. et al. Report on the fifty-first session (3–6 march 2020). UN Doc. E/CN 3, 37 (2020).
Dijkstra, L. et al. Applying the degree of urbanisation to the globe: A new harmonised definition reveals a different picture of global urbanisation. Journal of Urban Economics 125, 103312 (2021).
Acknowledgements
We are thankful to Luis Da Silva, Sarah Bui, Gabriel Kelvin, Ana Ibarra and Rui Alleyne for invaluable research assistance. This work was sponsored by the Growth Lab at Harvard University.
Author information
Contributions
J.M.A. conceived the project and developed dataset prototypes, S.G.M. led the streamlining of geospatial routines into a cogent and replicable pipeline. Both authors contributed equally to writing and reviewing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Morales-Arilla, J., Gadgin Matha, S. GLocal: A global development dataset of subnational administrative areas. Sci Data 11, 851 (2024). https://doi.org/10.1038/s41597-024-03539-y
DOI: https://doi.org/10.1038/s41597-024-03539-y