What do we know about poverty in North Korea?

Reliable quantitative information on the North Korean economy is extremely scarce. In particular, reliable income per capita and poverty figures for the country are not available. In this contribution, we provide for the first time estimates of absolute poverty rates in North Korean subnational regions based on the combination of innovative remote-sensed night-time light intensity data (monthly information for built areas) with estimated income distributions. Our results, which are robust to the use of different methods to approximate the income distribution in the country, indicate that the share of persons living in extreme poverty in North Korea may be larger than previously thought. We estimate a poverty rate for the country of around 60% in 2018 and a high volatility in the dynamics of income at the national level in North Korea for the period 2012–2018. Income per capita estimates tend to decline significantly from 2012 to 2015 and present a recovery since 2016. The subnational estimates of income and poverty reveal a change in relative dynamics since the second half of the 2012–2018 period. The first part of the period is dominated by divergent dynamics in income across regions, while the second half reveals convergence in regional income.


Introduction
Obtaining reliable quantitative information on the North Korean economy is notoriously difficult. Official (albeit not credible) data on aggregate income or production have not been published since 1990 (Kim et al., 2007). The World Factbook by the Central Intelligence Agency (Central Intelligence Agency, 2017) estimates gross domestic product (GDP) per capita on a purchasing power parity basis in the country to be $1700 in 2015, a figure which is based on an extrapolation of estimates obtained by Angus Maddison in the framework of a project initiated by the Organization for Economic Co-Operation and Development (OECD; Maddison, 1995, 2001). The availability of satellite data on night-time light emissions and other observable terrain characteristics has opened new possibilities in terms of understanding and visualizing socioeconomic developments worldwide (Proville et al., 2017) and provides an important tool to assess income levels and dynamics in countries where data do not exist or are completely unreliable (Henderson et al., 2012; Noor et al., 2008).
Our study contributes to the growing literature on the use of remote sensing data in social science applications (Proville et al., 2017) by providing novel empirical evidence on the level and dynamics of economic development in a country where quantitative information on income or poverty is not available. In addition, it proposes a method to estimate poverty in data-poor environments that can be used to assess empirically the dynamics of the distribution of income across individuals in world regions where little official information can be used. Our method relies on novel nightlight data extracted at monthly frequency using a mask that allows us to concentrate on built areas when analyzing luminosity levels and changes.
We present the first estimates of income and poverty in North Korea at the subnational level. Our methodological framework estimates GDP from remote-sensed luminosity data and combines these income estimates with distributional assumptions based on the similarity between the (few) measurable characteristics of the North Korean economy and those of other countries for which income distribution figures are available. The results obtained indicate that the few existing attempts to obtain reliable poverty figures for the country (Crespo Cuaresma et al., 2018) may have underestimated the incidence of extreme poverty. A rapid fall in income per capita estimates from luminosity data, coupled with an increase in the poverty rate, took place between 2013 and 2015, and this trend reversed in the period 2016-2018.

Results
In order to obtain estimates for total GDP at the regional level in North Korea, we use information sourced from the Suomi National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP/VIIRS), the latest generation of data on remote-sensed luminosity, which has been shown to have better properties than those from the Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS), the only existing satellite sensor used to collect night-time light until 2011 (Cao et al., 2013; Li et al., 2013; Ou et al., 2015). We employ the monthly VIIRS DNB composites produced by the Earth Observation Group, NOAA/NCEI. The yearly VIIRS DNB data products cover only 2015 and 2016, so we extracted median night-time luminosity per pixel from the monthly VIIRS DNB products spanning the years 2012 to 2018 in order to reconstruct the dynamics of luminosity in North Korea for the full 2012-2018 period. The data extraction was limited to built-up areas using the World Settlement Footprint 2015, a product from the German Aerospace Center, DLR (Marconcini et al., 2019). Median values were chosen to limit the influence of outliers, a particularly prevalent problem because the monthly VIIRS DNB products are not processed to the extent the yearly products are and may contain contamination that biases the estimates of luminosity. An additional correction was carried out for the 2017 and 2018 products, since a level shift in radiance intensity affects the whole VIIRS DNB product from 2016 to 2017 and 2018. The correction was carried out using the trend observed for South Korea from 2012 to 2016 and projecting the trend to 2017 and 2018. The difference between the projected and the observed luminosity amounted to 9.43%. Figure 1 presents a visualization of average radiance for North Korea in the year 2015, and Fig. 2 presents the relative change in luminosity by region for the period 2012-2018.
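The two processing steps described above can be illustrated with a minimal sketch. It assumes that the monthly composites for one year are already loaded as a NumPy array and that the built-up mask is a boolean raster on the same grid. The function names, the use of a sum to aggregate pixels, the linear form of the trend and the way the shift factor is applied are assumptions made for illustration, not details reported in the text.

```python
import numpy as np

def annual_builtup_luminosity(monthly_stack, builtup_mask):
    """Per-pixel median over months, aggregated over built-up pixels.

    monthly_stack: array (months, rows, cols) of radiance, NaN = no valid observation
    builtup_mask:  boolean array (rows, cols), True for built-up pixels
    Aggregating with a sum is an assumption of this sketch.
    """
    median_per_pixel = np.nanmedian(monthly_stack, axis=0)
    return float(np.nansum(median_per_pixel[builtup_mask]))

def level_shift_factor(years, reference_series, fit_until=2016):
    """Ratio of observed to trend-projected luminosity in a reference country.

    A linear trend (assumed functional form) is fitted on years <= fit_until
    and projected to the later years; the mean observed/projected ratio is returned.
    """
    years = np.asarray(years, dtype=float)
    ref = np.asarray(reference_series, dtype=float)
    fit = years <= fit_until
    slope, intercept = np.polyfit(years[fit], ref[fit], deg=1)
    projected = slope * years[~fit] + intercept
    return float(np.mean(ref[~fit] / projected))

def remove_level_shift(observed, factor):
    """Rescale post-shift observations by the estimated factor (assumed usage)."""
    return np.asarray(observed, dtype=float) / factor
```

Taking the median across months means that a single contaminated monthly value (for example, from fires or stray light) does not move the annual figure for that pixel.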
The dynamics of luminosity present a heterogeneous picture of night-time light developments throughout the country for this period, with decreases in light emissions in some parts of Pyongyang, as well as in smaller cities for the period 2012-2016, combined with scattered increases in luminosity throughout the country. The period 2016-2018 is dominated by increases in luminosity in most regions of the country. The night-time light intensity data are transformed to GDP estimates at the level of subnational regions (Chagang-do, Hamgyong-bukto, Hamgyong-namdo, Hwanghae-bukto, Hwanghae-namdo, Kangwon-do, P'yongan-bukto, P'yongan-namdo, P'yongyang-si, Yanggang-do) making use of the results of the regression analysis based on Chinese prefecture data (Li et al., 2013). The luminosity-based estimate for GDP per capita in North Korea for 2018 using this method is approximately $790 (2011 PPP-adjusted), a level of income that would be among the lowest in the world, and its change in the period 2012-2018 is depicted in Fig. 3.
While the dynamics implied by changes in night-time lights for the period 2012-2016 indicate an overall fall in income throughout the period that affected practically all subnational regions in North Korea, this trend is partly reversed in the period 2016-2018. Furthermore, the period 2012-2016 was dominated by a trend towards divergence in average income per capita across regions within the country, with relatively poor regions experiencing on average larger falls in GDP per capita. The so-called Williamson hypothesis (Williamson, 1965) predicts an inverse U-shaped relationship between the state of economic development and cross-regional inequality within the country (Barrios and Strobl, 2009). A trend towards increasing regional disparities in income per capita is thus a typical characteristic of economies at relatively low levels of development. The change in dynamics observed in 2016-2018 leads to overall income convergence (as measured by luminosity) across North Korean subnational regions (Fig. 4). On average, the growth of luminosity has been higher for regions which started at low levels of night-time light intensity per capita in 2012.
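One common way to check for this kind of pattern is to regress growth over a period on the logarithm of the initial level: a negative slope indicates that initially poorer regions grew faster (convergence), a positive slope indicates divergence. The sketch below illustrates that check under the assumption of a simple least-squares fit; it is not necessarily the procedure behind Fig. 4, and the function name is hypothetical.

```python
import numpy as np

def convergence_slope(initial_levels, final_levels):
    """Slope of log growth on log initial level across regions.

    Negative slope: regions with lower initial levels grew faster (convergence).
    Positive slope: regions with higher initial levels grew faster (divergence).
    """
    log_initial = np.log(np.asarray(initial_levels, dtype=float))
    growth = np.log(np.asarray(final_levels, dtype=float)) - log_initial
    slope, _intercept = np.polyfit(log_initial, growth, deg=1)
    return float(slope)
```

Applied separately to 2012-2016 and 2016-2018, a change of sign in the slope would correspond to the switch from divergence to convergence described above.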
Evaluating extreme poverty dynamics in the country requires the estimation of income distributions for the different regions of North Korea. We approximate these making use of a method based on the estimation of Beta-Lorenz curves for all countries of the world for which data are available and constructing region-specific income distributions based on matching the existing subnational socio-demographic and economic data with those of other economies. Making use of these assumed income distributions, we obtain an estimate of the poverty rate in North Korea by aggregating the number of poor persons estimated for each one of the subnational regions. Mimicking the development of luminosity, for the period 2012-2016 this estimate implies a large widespread increase in poverty, followed by a decrease of poverty, which makes the estimate return to values similar to those observed in 2012, with an estimate of around 60% for the whole country but a large variation across subnational regions (Fig. 5). Such poverty rate estimates contrast with existing figures for North Korea obtained using fitted values from linear regression analysis at the macroeconomic level, such as those obtained using the methods proposed by Ravallion (2012) or Crespo Cuaresma et al. (2018), which indicate a share of persons currently living in extreme poverty of around 40%, a significantly smaller poverty incidence than that obtained using the method based on night-time luminosity. The dynamics of poverty within the country in the period 2012-2018 mimic the results we found for income per capita: poverty divergence between 2012 and 2016, followed by a convergence trend that results in a reduction of poverty variance across regions for the full period.
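To make the step from an assumed income distribution to a poverty rate concrete, the sketch below uses the Beta-Lorenz parameterization L(p) = p - theta * p^gamma * (1 - p)^delta, under which the headcount ratio H solves L'(H) = z / mu for poverty line z and mean income mu. The exact parameterization and estimation routine used for the matched economies are not spelled out in this section, so the functional form, the bisection solver, the parameter values in the usage note and all function names are assumptions for illustration.

```python
def lorenz_slope(p, theta, gamma, delta):
    """First derivative L'(p) of L(p) = p - theta * p**gamma * (1 - p)**delta."""
    return 1.0 - theta * p**gamma * (1.0 - p)**delta * (gamma / p - delta / (1.0 - p))

def headcount(mean, poverty_line, theta, gamma, delta, tol=1e-12):
    """Share of people below poverty_line: solve L'(H) = poverty_line / mean.

    For 0 < gamma, delta < 1 and theta > 0 the slope is increasing in p,
    so a simple bisection on (0, 1) finds the unique root.
    mean and poverty_line must be in the same units (e.g. dollars per day).
    """
    target = poverty_line / mean
    lo, hi = 1e-9, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lorenz_slope(mid, theta, gamma, delta) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def national_rate(regional_rates, regional_populations):
    """National poverty rate as total number of poor over total population."""
    poor = sum(r * n for r, n in zip(regional_rates, regional_populations))
    return poor / sum(regional_populations)
```

For example, `headcount(2.0, 1.9, theta=0.6, gamma=0.9, delta=0.5)` returns the poverty rate implied by a mean of $2.00 a day and purely illustrative Lorenz parameters; lowering the mean while keeping the curve fixed raises the rate.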
The relatively low poverty rates in Yanggang-do should be interpreted with care, since they are mostly driven by very low population estimates based on the census figures and our particular assumption of zero population growth differentials across regions over the period considered. The high levels of poverty in southern regions of the country, which may be more affected by the conflict with South Korea, correlate with the available figures on nutritional status of children by province in 2002 and 2004 (Smith, 2009). The Bank of Korea provides estimates of the growth rate of GDP per capita in North Korea using a method that is not reported in any of the available documents beyond being described as "using the basic data on production quantities supplied by relevant institutions" (Bank of Korea, 2017). The resulting estimates of Gross National Income per person in North Korea by the Bank of Korea correspond to 4.6% of that for South Korea. Both our (average) poverty rate estimates and their yearly change have a large negative correlation with the GDP growth rate estimates provided by the Bank of Korea for the period 2013-2016, which lends extra credibility to the dynamics obtained using our poverty figures. The resulting number of persons living in extreme poverty in North Korea in 2018 implied by the method ranges from roughly 14.9 million (assuming the income distribution corresponding to Romania in 1999) to 17 million (assuming the income distribution of Armenia in 1996).

Methods
From light intensity to GDP levels. We convert the NPP/VIIRS luminosity data by subnational region (Chagang-do, Hamgyong-bukto, Hamgyong-namdo, Hwanghae-bukto, Hwanghae-namdo, Kangwon-do, P'yongan-bukto, P'yongan-namdo, P'yongyang-si, Yanggang-do) to GDP in US dollars making use of the empirical relationship between night-time light and GDP described in Li et al. (2013), which provides estimates for the elasticity of GDP to   Li et al. (2013), the estimates of total GDP by region in North Korea can be considered reasonably reliable. We make use of the data on population by region from the 2008 census in North Korea and extrapolate them to the year 2018 using the national-level population growth rate implied by the figures from the 2017 revision of the United Nations' World Population Prospects (United Nations, 2017). We do not attempt to model cross-regional migration in the period under scrutiny, which may lead to a bias in our assessment of income convergence trends if internal migration took place from poorer to richer parts of the country during these years. The assumption of constant population growth across regions may also have a significant effect on the estimates of the dynamics of GDP per capita, but due to the lack of information on internal migration patterns in the country, it is impossible to complement the projections with a model for internal population mobility.
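The conversion can be sketched as follows, assuming a log-log relationship between regional GDP and regional luminosity. The intercept and elasticity are placeholders to be taken from the regression results in Li et al. (2013); no coefficient values are given in this section, so none are hard-coded here. The function names and the numbers in the usage note are hypothetical.

```python
import math

def gdp_from_light(total_radiance, intercept, elasticity):
    """GDP implied by log(GDP) = intercept + elasticity * log(radiance).

    intercept and elasticity are placeholders for externally estimated
    coefficients; the log-log form is an assumption of this sketch.
    """
    return math.exp(intercept + elasticity * math.log(total_radiance))

def extrapolate_population(census_population, national_base, national_target):
    """Scale a regional census count by national population growth.

    Every region grows at the national rate, i.e. no internal migration.
    """
    return census_population * national_target / national_base

def gdp_per_capita(total_radiance, intercept, elasticity,
                   census_population, national_base, national_target):
    """Regional GDP per capita from luminosity and extrapolated population."""
    gdp = gdp_from_light(total_radiance, intercept, elasticity)
    population = extrapolate_population(census_population, national_base, national_target)
    return gdp / population
```

With illustrative inputs, `extrapolate_population(1000.0, 24.0, 25.2)` scales a regional count of 1000 by 5% national growth to 1050. Because the same growth factor is applied to every region, differences in GDP per capita dynamics across regions come entirely from luminosity.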
Estimating poverty in North Korean regions. In order to assess quantitatively the potential range of poverty rates both at the national level and in North Korean regions, we compute the share of people living in extreme poverty (as measured by those earning <$1.90 a day) implied by imposing the income distributions of those economies that are most similar to North Korea in terms of income per capita, educational level and demographic and sectoral structure. For this purpose, we construct the Euclidean distance between the vectors composed of normalized measures of (a) employment share by economic sector (agriculture, manufacturing, services), (b) share of population by age (in 5-year groups) and gender, (c) share of persons by educational attainment level (primary, secondary, tertiary), and (d) GDP per capita, for North Korea and for all countries of the world for which these data are available. We select the seven closest economies and anchor their respective income distributions (based on estimating a Beta-Lorenz curve) on the average income per capita of North Korea as obtained by transforming the luminosity data to GDP per capita figures.
The data for education and age structure are sourced from the Wittgenstein Center for Demography and Global Human Capital (http://www.wittgensteincentre.org/dataexplorer). GDP per capita in constant 2011 PPPs comes from the World Economic Outlook (WEO) and employment by sector is sourced from the World Bank's World Development Indicators (WDI) database. For North Korea, GDP per capita is estimated using luminosity data and the rest of the variables are from the North Korean 2008 census.
Similarity between economies is defined based on different methods, using as input the four vectors that summarize the variables described above (age, employment, education, and GDP per capita). These methods are alternatively based on identifying the closest k economies using Euclidean distance and employing averages of these, or on using weighted averages employing the Euclidean distance between these vectors as weights. We start by calculating the Euclidean distance between the vectors of variables for North Korea (and its subnational regions) and those of all country/year pairs for which data are available. We normalize the distance vectors to fall in the range between 0 and 1 and sum over the four vectors (age, employment, education, and GDP per capita) to obtain a ranking of similarity that allows us to select the k countries whose characteristics are the closest to those of North Korea.
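The ranking step can be sketched as follows. The sketch assumes that each of the four groups of variables is stored as a separate block, that distances within each block are rescaled with a min-max transformation (the text states only that they are normalized to lie between 0 and 1), and that the rescaled distances are summed with equal weight. Function and variable names are hypothetical.

```python
import numpy as np

def similarity_ranking(target_blocks, candidate_blocks, k):
    """Rank candidate country/years by summed, normalized Euclidean distance.

    target_blocks:    dict block name -> 1-D vector for the economy of interest
    candidate_blocks: dict block name -> 2-D array (candidates, variables)
    Returns the indices of the k closest candidates and all total distances.
    """
    n_candidates = len(next(iter(candidate_blocks.values())))
    total = np.zeros(n_candidates)
    for name, target_vector in target_blocks.items():
        candidates = np.asarray(candidate_blocks[name], dtype=float)
        target = np.asarray(target_vector, dtype=float)
        distance = np.linalg.norm(candidates - target, axis=1)
        span = distance.max() - distance.min()
        if span > 0:  # min-max rescaling to [0, 1] (assumed normalization)
            total += (distance - distance.min()) / span
    closest = np.argsort(total)[:k]
    return closest, total
```

Normalizing each block before summing keeps a block with many components, such as the age structure in 5-year groups, from dominating the single GDP per capita figure.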
Data availability restrictions lead to a group of 34 countries spanning 40 country/year observations, which can be used for validation of our poverty estimation method. For our validation exercise we apply values of k ranging from 3 to 10 and assess the predictive power of the method for reconstructing income distributions for country/years where data are available. For a given k, the median poverty rate of the k countries with the smallest Euclidean distance to the vectors of socioeconomic characteristics (age, employment, education and GDP per capita) is used as an estimate. Alternatively, we also employ a weighted average of the poverty rates of all potential comparator countries weighted with the inverse of the similarity measure given by the corresponding Euclidean distance to the economy of interest. Versions of this weighting scheme based on second, third, and fourth powers of these weights are also applied and compared in the cross-validation exercise. Our metric of evaluation is the root-mean-squared prediction error of the estimate when compared to the actual survey-based poverty rate figures. For the construction of the vectors of characteristics, two alternatives are used depending on whether the distance is based on the level of GDP per capita or its logarithm. This leads to 24 potential variants when implementing the method based on the choice of (a) a value of k = 3, …, 10 or a weighting scheme (first to fourth power of the inverse Euclidean distance) and (b) whether GDP per capita is used in levels or in log-levels. We conduct the validation exercise using the predictions of poverty rates implied by all 24 variants for the 40 available country/year observations. The results are presented in Table 1.
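A minimal sketch of such a validation loop is given below. It assumes a leave-one-out design, in which each country/year with a known poverty rate is predicted from the remaining observations, and a precomputed matrix of strictly positive pairwise distances. The leave-one-out design and all names are assumptions for illustration.

```python
import numpy as np

def knn_median(distances, rates, k):
    """Median poverty rate of the k closest comparator observations."""
    nearest = np.argsort(distances)[:k]
    return float(np.median(rates[nearest]))

def inverse_distance_average(distances, rates, power):
    """Average of all comparator rates weighted by distance ** (-power)."""
    weights = 1.0 / distances ** power  # distances assumed strictly positive
    return float(np.sum(weights * rates) / np.sum(weights))

def leave_one_out_rmse(distance_matrix, rates, predictor, **kwargs):
    """Root-mean-squared error when each observation is predicted from the rest."""
    n = len(rates)
    errors = []
    for i in range(n):
        others = np.arange(n) != i
        prediction = predictor(distance_matrix[i, others], rates[others], **kwargs)
        errors.append(prediction - rates[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# The 24 variants: (8 values of k + 4 weighting powers) x (levels, logs of GDP).
variants = [(gdp_scale, kind, value)
            for gdp_scale in ("level", "log")
            for kind, values in (("k", range(3, 11)), ("power", range(1, 5)))
            for value in values]
```

Each variant would be scored with `leave_one_out_rmse` on the distance matrix built with the corresponding GDP scale, and the variant with the lowest error retained.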
The validation results confirm that the approach based on the median estimates from the k closest country/year observations is superior to the method based on weighting for all powers of the distance measure. In addition, estimates based on log-transformed GDP per capita figures also tend to provide more precise poverty predictions within the group of fixed k models. The distributions of prediction errors across choices of k and by country/year observation are depicted in Figs 6 and 7, respectively. The model based on the choice of k = 7, which corresponds to the specification that performs best in the validation exercise, does not deliver large outliers in the tail of negative prediction errors, which tend to be endemic for other choices of k.
For our choice of k = 7, the most similar economies to North Korea in terms of age structure, sectoral composition, educational attainment and average income per capita are Romania in 1999, Albania in 2002, Madagascar in 2002, Georgia in 1996, Bangladesh in 2016, Vietnam in 2010, and Armenia in 1996. For this set of economies, we calculate the share of mean expenditure to GDP per capita using data sourced from PovcalNet, UNU-WIDER and Poverty Equity, and employ the mean expenditure per unit of GDP per capita for North Korean regions in order to anchor the income distribution. Finally, we combine the mean expenditure estimate in North Korean regions with the corresponding Lorenz curves estimated for each matched economy (Crespo Cuaresma et al., 2018) in order to obtain poverty rates. Our final estimate for a given region is the median of those rates. Overall, the set of possible choices of income distributions is composed of 1559 Lorenz curve estimates.
We also assess the robustness of our estimates applying alternative methods based on clustering. For each one of the four vectors of variables, we use agglomerative (bottom-up) hierarchical clustering to find similar economies. This method starts with each country/year observation in a different cluster. At each step, the most similar pairs of clusters are merged, and this is repeated until only one cluster is left. Cluster similarity is determined based on the most pairwise dissimilar elements in each cluster (complete linkage). The resulting dendrogram can be used to select country/years sharing a cluster with North Korea. This method, however, delivers predictions that are inferior in terms of predictive power to those obtained by setting k = 7, which is our choice for the estimation of poverty rates.
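The clustering alternative can be sketched with SciPy's hierarchical clustering routines. The complete-linkage choice follows the description above; cutting the dendrogram into a fixed number of clusters is an assumption of this sketch, since the rule used to read comparators off the dendrogram is not stated. The function name is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_mates(features, target_index, n_clusters):
    """Observations that share a complete-linkage cluster with the target.

    features:     2-D array (observations, variables) for one block of variables
    target_index: row of the economy of interest
    n_clusters:   number of clusters at which the dendrogram is cut (assumption)
    """
    merges = linkage(np.asarray(features, dtype=float),
                     method="complete", metric="euclidean")
    labels = fcluster(merges, t=n_clusters, criterion="maxclust")
    same_cluster = np.flatnonzero(labels == labels[target_index])
    return [int(i) for i in same_cluster if i != target_index]
```

Unlike the fixed-k approach, the number of comparators returned here depends on where the dendrogram is cut and on how the observations group together.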

Conclusion and limitations
The availability of high-quality remote-sensed night-time luminosity data opens new avenues for the estimation of income levels and dynamics with a high degree of geographic granularity (Proville et al., 2017). Using the latest generation of data on remote-sensed luminosity, we are able to estimate income per capita and poverty rates for North Korea, as well as for its subnational regions. Our estimates reflect low GDP per capita (around $790 per capita and a 60% poverty rate) and volatile dynamics of income at the national level in North Korea for the period 2012-2018, with a sizeable decrease in the estimate from 2012 to 2015 and a recovery since 2016. Within the country, the relative dynamics of income per capita and poverty rates across subnational regions have also changed in the second half of the 2012-2018 period, with the first part of the period being dominated by divergent dynamics and the second half by convergence in regional income.
The lack of alternative credible figures of GDP per capita in North Korea makes a rigorous validation of our exercise virtually impossible. The method chosen to estimate poverty rates from age structure, sectoral employment, education and GDP per capita information performs reasonably well for countries with