Background & Summary

After the collapse of the Soviet Union, agricultural land was abandoned across the former Soviet Union (fSU) countries as a result of the restructuring of the economy and the adjustment towards open-market conditions from 1990 to 2010 (refs 1–3). These major land-use changes have had significant regional and global impacts; for example, Schierhorn et al.4 illustrate impacts beyond the borders of the fSU.

Unfortunately, we still have limited knowledge of the spatial distribution of abandoned land in the fSU countries. Accurate spatial information on land abandonment is required for many studies, e.g. as a benchmark for monitoring cropland expansion and highlighting areas suitable for biomass production, but also to pinpoint opportunities for increasing ecosystem services, such as carbon sequestration on abandoned lands and increasing habitats for umbrella species5–9. However, existing global land cover/land use maps suffer from a high level of uncertainty (e.g. refs 10–12) and are not tailored towards the identification of abandoned land. For example, the global land cover time series from 1992 to 2015, produced in the framework of the Climate Change Initiative (CCI) of the European Space Agency (ESA)13, does not account for any losses in cropland over this time period, yet the area sown shrank by 42.5 Mha between 1990 and 2010 according to national Russian statistics14. Global mapping initiatives such as the ESA CCI usually focus on certain types of land cover change to satisfy the needs of one group of users, addressing the needs of other users only partially. For the development of this recent ESA CCI land cover product, the CCI community did not prioritize mapping of cropland change but rather focused on forest loss and gain.

At the same time, there have been efforts to map abandoned land both for small study plots and regionally15–18. For example, Prishchepov et al.15 have developed a map of abandoned arable land at a 30 m resolution for a few study plots in Russia, Poland and Lithuania, covering the period 1990–2000, while Kraemer et al.17 have mapped a cropland time series for 1990–2010 based on Landsat imagery covering two study plots in Kazakhstan. Another example is a map of farmland abandonment by Estel et al.18, which is based on MODIS time series and covers all of Europe for the period 2001–2012. These maps differ in their spatial and temporal extent as well as in their definitions of abandoned arable land, which makes it impossible to compare them directly. Moreover, these maps do not fully cover Kazakhstan or the non-European part of Russia. Hence there is a clear need to develop an accurate map of abandoned land that covers the whole fSU.

This paper presents a state-of-the-art hybrid map of current arable and abandoned land for eight fSU countries (Armenia, Azerbaijan, Belarus, Georgia, Kazakhstan, Republic of Moldova, Russian Federation and Ukraine). By fusing the best available global and regional spatial information, the map provides information on land abandonment by 2010. To increase the quality of the map, the data fusion methodology used training data collected by visual interpretation of very high resolution (VHR) imagery using Geo-Wiki19,20. We assessed the accuracy of the product with a second, independent Geo-Wiki data set.

Methods

In this study, we aimed to collect and fuse existing sources of information, including indicators of land abandonment derived from remote sensing data. These include abandoned arable land maps that were produced by classification of Landsat imagery17; classification of MODIS-based time series of the Normalized Difference Vegetation Index (NDVI)18,21; or downscaling of statistical data on abandoned land based on the calculation of a so-called cropland suitability index1. Among the existing data fusion approaches, e.g. regression, decision trees or neural networks, we have chosen the Naïve Bayes classifier22. Naïve Bayes is the basic form of a Bayesian network and, as such, is a direct implementation of Bayes’ theorem. It is easy to implement, can be updated dynamically, and deals easily with missing data. Moreover, it has been shown to perform well on most classification tasks and is often significantly better than other classification methods23,24.

Figure 1 presents a flowchart of the methodology used to create the hybrid map of arable and abandoned land. We first collected land cover maps from different epochs as well as regional maps of abandoned land. Moreover, with the help of regional experts using the Geo-Wiki19,20 land cover tool, we developed a reference (training) data set on arable and abandoned land, using visual interpretation of VHR historical imagery from Google and Bing. We then integrated the different land cover and abandoned land maps with the Geo-Wiki reference data set using a data fusion algorithm to produce a hybrid map of arable and abandoned land. The target resolution of the final product is 10 arc-seconds (ca 300 m at the equator) to match the geometry and spatial resolution of two input products: the hybrid global land cover map25 and the ESA CCI land cover13 products.

Figure 1: A flowchart of the methodology used to create the hybrid map of arable and abandoned land.

Map legend and definitions

As one of the inputs, we used land cover maps that include cropland as a land cover class. However, cropland or arable land is a land use class according to the definition provided by the Food and Agriculture Organization (FAO) of the United Nations. Therefore, in this paper, we refer to arable and abandoned as land use classes.

National statistics on land include the following land use classes based on definitions from FAO26 with specific regional differences:

  • Arable land is land under temporary crops, temporary meadows for mowing or pasture, land under market or kitchen gardens and land temporarily laid fallow (less than five years). Temporarily fallowed land is land set aside for one or more years before being cultivated again.

  • Sown area refers to the area on which sowing or planting has been carried out for the crop under consideration on the soil prepared for that purpose. (http://faostat.fao.org/site/375/default.aspx).

  • Fallow land (temporary) is the cultivated land that is not sown for one or more growing seasons. The maximum idle period is usually less than five years. Land remaining fallow for too long may acquire characteristics requiring reclassification, such as "permanent meadows and pastures" (if used for grazing), "forest or wooded land" (if overgrown with trees), or "other land" (if it becomes wasteland).

  • Agricultural land refers to the land area that is arable, under permanent crops, or under permanent pastures and hayfields.

The hybrid map developed here consists of three land use classes: arable land in use, abandoned arable land, and other land uses (e.g. urban, forest, etc.).

  1. Arable land includes sown area and bare fallow (cultivated, but not seeded).

  2. Abandoned arable land is the land that was previously cultivated (i.e. belongs to the agricultural land use class) but has not been utilized for more than 5 years1,27,28. “Abandoned arable land” is almost never reported, and is calculated as the difference between the total arable land and the utilized arable land.

  3. Other land use is land that is not currently and has never been utilized for agricultural purposes, or formerly arable land that is now occupied by infrastructure and can therefore no longer be considered as potentially available for agricultural purposes.

Input maps

To be used as input data, we collected maps that provide us with the following information:

  1. Abandonment of arable land derived from remote sensing data, such as abandoned land from Alcantara21, abandoned land from Prishchepov15, etc.

  2. Series of annual land cover maps, such as MODIS land cover29 and CCI land cover13. These maps provide additional information on the transition of land cover from one type to another, e.g. from cropland to grassland, shrubland or forest.

  3. Land cover maps and cropland maps for 2010. There are many more land cover maps available for the year 2010 than for earlier years. Some maps for 2010 are more accurate than the maps for 2000 or older because it is possible to obtain better training data for the most recent years, e.g. GlobeLand3030 for 2010 compared to 2000. We consider these maps useful for delineating active cropland for 2010 and other land cover classes that are mapped with high accuracies, e.g. water, forest and bare land.

Table 1 lists the land cover and land use maps that we used as inputs to produce the hybrid map and their correspondence to the land use classes of arable utilized land, arable abandoned land, and other land uses. Table 1 also shows the spatial and temporal coverage of each input data set. We brought the input data sets to the target resolution of 10 arc-seconds as follows. First, we simplified the legends by merging land cover classes that are similar but not relevant to agriculture, e.g. different types of forest (Supplementary Table S1). Second, we aggregated maps with a finer resolution than 10 arc-seconds to a 10 arc-second resolution: for categorical data, we applied a majority rule, while for continuous data, we calculated the mean. Third, we resampled all maps to the same grid by applying the nearest neighbor technique. Finally, we converted continuous variables (e.g. percentage cropland) to categorical ones by using a 50% threshold.
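The aggregation and thresholding steps can be illustrated with a minimal NumPy sketch. It assumes that the fine-resolution raster dimensions are exact multiples of an integer aggregation factor, that ties in the majority rule go to the smallest class code, and that the 50% threshold is inclusive; the function names and these choices are illustrative assumptions, not details given in the text.

```python
import numpy as np

def aggregate_majority(arr, factor):
    """Aggregate a categorical raster by an integer factor using the majority class."""
    rows, cols = arr.shape
    # Split the raster into (factor x factor) blocks and flatten each block.
    blocks = arr.reshape(rows // factor, factor, cols // factor, factor)
    blocks = blocks.transpose(0, 2, 1, 3).reshape(rows // factor, cols // factor, -1)
    out = np.empty(blocks.shape[:2], dtype=arr.dtype)
    for r in range(blocks.shape[0]):
        for c in range(blocks.shape[1]):
            values, counts = np.unique(blocks[r, c], return_counts=True)
            out[r, c] = values[np.argmax(counts)]  # ties: smallest class code (assumption)
    return out

def aggregate_mean(arr, factor):
    """Aggregate a continuous raster by an integer factor using the block mean."""
    rows, cols = arr.shape
    return arr.reshape(rows // factor, factor, cols // factor, factor).mean(axis=(1, 3))

def to_categorical(percent_cropland, threshold=50.0):
    """Convert percentage cropland to a 0/1 cropland mask (inclusive threshold assumed)."""
    return (percent_cropland >= threshold).astype(np.uint8)

# Hypothetical 4 x 4 rasters aggregated by a factor of 2.
land_cover = np.array([[1, 1, 2, 2],
                       [1, 3, 2, 2],
                       [3, 3, 1, 2],
                       [3, 3, 1, 1]])
percent = np.array([[80., 60., 10., 20.],
                    [70., 90.,  0., 30.],
                    [40., 60., 50., 50.],
                    [55., 45., 50., 50.]])

coarse_cover = aggregate_majority(land_cover, 2)
coarse_percent = aggregate_mean(percent, 2)
cropland_mask = to_categorical(coarse_percent)
```

In practice, a raster library would also handle reprojection and the nearest neighbor alignment to the common grid, which this sketch omits.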

Table 1 Land use classes and coverage of the input data sets.

Geo-Wiki reference data on abandoned land

We collected reference data on abandoned land through the Geo-Wiki platform (http://geo-wiki.org), which allows users to classify Google Earth and Bing VHR imagery. An example of the interface is provided in Figure 2. The blue box corresponds to a 10 arc-second pixel; the time slider in the top left corner is used to view the available historical imagery at this location, and the user chooses the classes from the right-hand panel.

Figure 2: Screenshot of the Geo-Wiki interface to collect expert training data.

Twenty experts from the IIASA Geo-Wiki network along with partners from the AGRICISTRADE project took part in an imagery classification campaign; together they collected information at ca 15K points. These expert data were then used for training a Bayesian network to fuse the input data sets into a hybrid product.

As part of the data collection process, we asked the experts to determine whether more than 50% of each pixel was arable land, abandoned arable land or other land. When it was impossible to define a unique class, the experts had the option to choose “Not sure” (see Figure 2). We excluded “Not sure” locations when training the Bayesian network. The experts examined both historical imagery and historical NDVI profiles at each location. Figure 3 provides an example of how historical VHR imagery in Geo-Wiki was used to identify abandoned land in two different cases. In particular, the increased number of shrubs over time, which is clearly visible in Figure 3, is a visual sign of abandonment. Abandoned land may include not only abandoned arable land but also abandoned pastures.

Figure 3: Examples (Geo-Wiki screenshots) of abandoned land.

(a1) Coordinates 55.18 N 83.04 E. The image from 2004 shows cropland. (a2) Coordinates 55.18 N 83.04 E. The image from 2013 shows abandoned land. (b1) Coordinates 56.02 N 37.88 E. The image from 2007 shows cropland. (b2) Coordinates 56.02 N 37.88 E. The image from 2016 and the ground truth photo from 2015 confirm that it is now abandoned land.

Bayesian network

We combined the input data sets with the Geo-Wiki reference data set using a Bayesian network to produce a hybrid map of arable and abandoned land. The Naïve Bayes classifier has been shown to perform well in classification problems (e.g. refs 38,39). One of the advantages of this method is the ease with which it incorporates input data sources that have differing classifications. This means that there is no need to translate land cover classes into the same legend, e.g. the forest gain map by Hansen can indicate areas where forest gains have taken place on formerly cultivated agricultural lands39. In addition, some of the input data sets provide information for only part of the fSU region (e.g. refs 1,15,21), but the Naïve Bayes classifier can handle missing data. Finally, this approach allows us to use input data with different temporal extents. We considered the Geo-Wiki reference data set as the truth.

We have applied the Naïve Bayes classifier as follows. Let $G_i$ be the truth in location $i$, and $\{S\} = \{S_1^i, S_2^i, \ldots, S_K^i\}$ be the readings of the $K$ satellites in that location. In general, one can partition the set of satellite observations (input maps) into conditionally independent subsets: $\{S\} = \{\{S^{(1)}\}, \{S^{(2)}\}, \ldots, \{S^{(J)}\}\}$, where $J \le K$ is the number of such subsets. The Bayes’ formula used is:

$$\Pr(G \mid \{S\}) = \frac{\Pr(\{S\} \mid G)\,\Pr(G)}{\Pr(\{S\})} = \frac{\prod_{j} \Pr(\{S^{(j)}\} \mid G)\,\Pr(G)}{\sum_{g} \prod_{j} \Pr(\{S^{(j)}\} \mid G = g)\,\Pr(G = g)} \qquad (1)$$

We estimated the conditional probabilities $\Pr(\{S^{(j)}\} \mid G)$ from the contingency tables for the classifications obtained through Geo-Wiki and the kth input map classification. The region-specific prior probabilities $\Pr(G)$ were assumed to be equal.
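As a rough illustration of this estimation step, the sketch below builds a contingency table from paired reference and map labels and normalizes each reference-class row to obtain conditional probabilities. The function name, the integer class coding and the sample labels are hypothetical, and no smoothing of zero counts is applied because the text does not mention one.

```python
import numpy as np

def conditional_probabilities(reference, mapped, n_truth, n_map):
    """Estimate Pr(S = s | G = g) from paired reference (G) and input-map (S) labels."""
    table = np.zeros((n_truth, n_map))
    # Contingency table: rows are reference classes, columns are map classes.
    np.add.at(table, (reference, mapped), 1)
    row_totals = table.sum(axis=1, keepdims=True)
    # Normalize each row; rows without reference samples stay at zero.
    return np.divide(table, row_totals, out=np.zeros_like(table), where=row_totals > 0)

# Hypothetical labels: G in {0: arable, 1: abandoned, 2: other}, S in {0, 1}.
reference = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
mapped = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 0])

probs = conditional_probabilities(reference, mapped, n_truth=3, n_map=2)
```

Each row of `probs` then plays the role of one row in a table of conditional probabilities for a single input map.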

If the data are only available for a subset $\{S^*\}$ of $\{S\}$ and missing for the rest, denoted here as $\{\bar{S}\}$, then the probability becomes:

$$\Pr(G \mid \{S^*\}) = \frac{\sum_{\bar{S}} \prod_{S} \Pr(S \mid G)\,\Pr(G)}{\sum_{g} \sum_{\bar{S}} \prod_{S} \Pr(S \mid G = g)\,\Pr(G = g)} = \frac{\prod_{S^*} \Pr(S \mid G)\,\Pr(G)}{\sum_{g} \prod_{S^*} \Pr(S \mid G = g)\,\Pr(G = g)} \qquad (2)$$

because the conditional probabilities of each missing source sum to one over its classes, $\sum_{\bar{S}} \Pr(\bar{S} \mid G) = 1$. Thus, if no information is available from a given input data source, the corresponding terms are simply omitted from the model.

After abandonment, agricultural land usually transforms into another land cover class: grassland, shrubland or forest. This transformation depends on human impact, bioclimatic zone, altitude, and other factors. Therefore, the Naïve Bayes classifier was run at the ecozone level40 in order to capture the different transformation processes that follow land abandonment. For example, abandoned croplands in forested regions in Ukraine and Belarus will become afforested over the years, while abandoned croplands in the steppe regions of Siberia and in Kazakhstan will revert to grasslands. Note that we initially ran a series of tests with different strata, such as the whole study region or national boundaries. However, this resulted in massive overestimation of abandoned land and was therefore abandoned in favour of the ecozone stratification.

From the application of the Bayesian approach, we obtained a probability map of cropland, abandoned arable land, and other land (summing to 1 in each pixel). Then we selected the class with the highest probability in each pixel to produce the final hybrid map product.
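The per-pixel computation, including the treatment of missing inputs and the final selection of the most probable class, can be sketched as follows. The likelihood tables, map names and class order are hypothetical placeholders rather than values from this study; a missing input is encoded as `None` and its term is simply skipped.

```python
import numpy as np

CLASSES = ["arable", "abandoned", "other"]

def posterior(likelihoods, observations, prior):
    """Naive Bayes posterior over CLASSES for one pixel, skipping missing inputs."""
    post = np.asarray(prior, dtype=float).copy()
    for name, observed_class in observations.items():
        if observed_class is None:  # no data from this input map at this pixel
            continue
        # Multiply by Pr(S = observed_class | G = g) for every class g.
        post *= likelihoods[name][:, observed_class]
    total = post.sum()
    if total == 0:  # observations impossible under every class: fall back to the prior
        return np.asarray(prior, dtype=float)
    return post / total

# Hypothetical conditional probability tables (rows: G, columns: map class).
likelihoods = {
    "map_a": np.array([[0.7, 0.2, 0.1],
                       [0.2, 0.5, 0.3],
                       [0.1, 0.1, 0.8]]),
    "map_b": np.array([[0.6, 0.4],
                       [0.3, 0.7],
                       [0.5, 0.5]]),
}
prior = [1 / 3, 1 / 3, 1 / 3]

# map_b has no data here, so only map_a contributes.
probabilities = posterior(likelihoods, {"map_a": 1, "map_b": None}, prior)
label = CLASSES[int(np.argmax(probabilities))]
```

Running the same function once per ecozone, each with its own likelihood tables, would mirror the stratification described above.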

Example of applying the Naïve Bayes classifier at the pixel level

The following provides an example of how the Naïve Bayes classifier operates at the pixel level, using a simple situation where observations of only two satellites, SA and SB, are available. The satellite SA classifies observations into 3 classes, A1, A2, and A3, whereas the satellite SB classifies observations into 2 classes, B1 and B2. The conditional probabilities of observing each of these classes in arable land (G1), abandoned arable land (G2), and other land (G3) are given in Table 2 and Table 3. Thus, for example, the satellite SA will assign the abandoned land to classes A1, A2, and A3 with probabilities 0.8, 0.2, and 0.0 respectively, and these probabilities will sum to one. These probabilities are calculated from the Geo-Wiki reference data on abandoned land.

Table 2 Satellite A: Conditional probabilities of observing classes A1, A2, and A3 for arable land (G1), abandoned arable land (G2), and other land (G3) respectively.
Table 3 Satellite B: Conditional probabilities of observing classes B1 and B2 for arable land (G1), abandoned arable land (G2), and other land (G3) respectively.

Suppose now that we want to estimate the probability that a cell assigned to classes A1 and B2 by satellites SA and SB respectively is arable. Assuming that the prior probabilities of each class (G1, G2, G3) are equal to 1/3 (we rounded it to 0.3), then:

$$\Pr(G_1 \mid A_1, B_2) = \frac{\Pr(A_1 \mid G_1)\,\Pr(B_2 \mid G_1)\,\Pr(G_1)}{\Pr(A_1 \mid G_1)\,\Pr(B_2 \mid G_1)\,\Pr(G_1) + \Pr(A_1 \mid G_2)\,\Pr(B_2 \mid G_2)\,\Pr(G_2) + \Pr(A_1 \mid G_3)\,\Pr(B_2 \mid G_3)\,\Pr(G_3)}$$

$$= \frac{0.8 \times 0.4 \times 0.3}{0.8 \times 0.4 \times 0.3 + 0.1 \times 0.8 \times 0.3 + 0.1 \times 0.5 \times 0.3} = \frac{0.096}{0.135} \approx 0.71.$$

Note that the classes for the two satellites do not need to be in any way compatible, nor do they need to correlate strongly with the variable of interest G. In terms of estimator performance, the best results are achieved when, for any source and class C, Pr(C|G1) differs substantially from Pr(C|G2) or Pr(C|G3). On the other hand, when Pr(C|G1) = Pr(C|G2) = Pr(C|G3) for every class, the posterior distribution Pr(G1|{S*}) will always equal the prior distribution Pr(G1), so the observations will be completely uninformative.
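The uninformative case follows directly from Bayes’ formula. As a short worked step, suppose a source reports class $C$ with the same conditional probability $c > 0$ under every land use class; the symbol $c$ is introduced here only for illustration:

$$\Pr(G_1 \mid C) = \frac{\Pr(C \mid G_1)\,\Pr(G_1)}{\sum_{g} \Pr(C \mid G = g)\,\Pr(G = g)} = \frac{c\,\Pr(G_1)}{c \sum_{g} \Pr(G = g)} = \Pr(G_1),$$

since the prior probabilities sum to one. The common factor $c$ cancels, so such a source leaves the posterior unchanged regardless of which class it reports.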

Recommendation for mapping abandoned land in other regions of the world

The methodology presented here could be used for mapping abandoned land in other regions of the world. Two components are needed: (i) the input maps of land cover, cropland and abandoned arable land (if available) corresponding to the regions of interest; and (ii) the reference data set on abandoned arable land. The latter data set can be collected from field data or from very high resolution satellite data using an application such as Geo-Wiki or Collect Earth (http://www.openforis.org/tools/collect-earth.html). The spatial resolution of the map produced using the methodology outlined here should be dependent on the size of the abandoned fields.

Data Records

The two data records are provided in zipped files (.zip):

Figure 4 shows the hybrid map of arable and abandoned land in the fSU countries, presented in this paper.

Figure 4: Spatial distribution of arable and abandoned land in the fSU.

Legend items: 1 - arable land, 2 - abandoned land, 3 - other land.

The map is also available from the Geo-Wiki Agricistrade page, where we overlaid it on top of Google Maps and Bing satellite imagery using Open Layers. Users can examine the map by zooming into specific locations or gain an overview of the map by panning around the region.

Technical Validation

We have validated the hybrid map by following the procedure set out in Olofsson et al.41, which allows for the estimation of confidence intervals and adjusted areas based on confusion matrices. The validation sample design follows a two-step random stratified approach:

  1. The first stratum is by country/region: Russia, Belarus, Moldova, Kazakhstan and Ukraine as individual countries and Armenia, Azerbaijan and Georgia grouped together as the “Caucasus” region;

  2. The second stratum is by mapped class: arable utilized, abandoned land and other land cover types.

The final sample consists of 5972 pixels at a 10 arc-second resolution by country/region as follows: 1504 sample pixels in Russia; 911 in Belarus; 923 in Moldova; 915 in Kazakhstan; 922 in Ukraine; and 797 in the Caucasus. We randomly distributed the pixels across the countries with an increased number of samples in rare classes, i.e. utilized arable and abandoned land. We invited regional experts from Ukraine and Russia to classify the sample by visual interpretation of VHR historical imagery available from Google and Bing in Geo-Wiki. The experts were asked to identify the dominant land use in each sample pixel, i.e. arable utilized, abandoned land or other land. If it was difficult to determine a unique class, the experts were asked to select one of the following classes: “not sure if arable or abandoned land”, “not sure if arable or others”, “not sure if abandoned land or other land”. These “not sure” sample sites were used in the accuracy assessment. For example, if a validation site was classified as “not sure if arable utilized or abandoned land” and the mapped class was arable, then a value of 0.5 was added to the cell of the confusion matrix in the row mapped class “arable” and column reference class “arable” while the other 0.5 was added to the cell in the row mapped class “arable” and column reference class “abandoned land” (Table 6).

Table 6 Example of counting for “not sure” validation points in confusion matrices.
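The fractional counting of “not sure” validation points can be sketched as below. The class names, the function name and the three sample points are hypothetical; the only rule taken from the text is that a point with two candidate reference classes contributes 0.5 to each of the two corresponding cells in the row of its mapped class.

```python
import numpy as np

CLASSES = ["arable", "abandoned", "other"]

def add_validation_point(matrix, mapped, reference_options):
    """Add one validation point to the confusion matrix (rows: mapped, columns: reference).

    A confident interpretation passes one reference class (weight 1.0); a
    "not sure" interpretation passes two, and each receives a weight of 0.5.
    """
    weight = 1.0 / len(reference_options)
    row = CLASSES.index(mapped)
    for reference in reference_options:
        matrix[row, CLASSES.index(reference)] += weight

confusion = np.zeros((3, 3))
add_validation_point(confusion, "arable", ["arable"])                # confident
add_validation_point(confusion, "arable", ["arable", "abandoned"])   # not sure
add_validation_point(confusion, "abandoned", ["abandoned", "other"]) # not sure
```

Because every point still contributes a total weight of one, the row sums continue to equal the number of validation points per mapped class.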

There are many challenges in mapping abandoned land, which are difficult to tackle and which result in low user accuracies for this land use class, for example:

  • In Moldova and Caucasus, the fields are much smaller than a 10 arc-second grid, and there are many orchards that are confused with abandoned land from remote sensing;

  • In the forest-steppe and forest zones of Ukraine and Belarus, where the majority of abandoned lands are allocated in these countries, the landscapes are very fragmented and therefore difficult to map from remote sensing;

  • In Kazakhstan, abandoned lands change from arable to grassland, which is a land cover transition type that is very difficult to map in the steppe zone with a very dry climate;

  • In Russia, due to its large territory, there are abandoned lands in the forest zone with high fragmentation, and there are abandoned fields in the steppe.

Table 7 Accuracy measures for the hybrid map.

Figure 5 presents the area estimates for abandoned land (95% confidence interval). We calculated the country statistics based on official country reports as the difference between the arable and cultivated area42–49. The adjusted areas were calculated based on the confusion matrices (Supplementary Tables S2–S7) by following the procedure set out in Olofsson et al.41. In Figure 5, the error bar from the map for Kazakhstan does not overlap the official estimate, which indicates underestimation by the official statistics. The overall error bar also lies outside the total abandoned land area from the statistics, indicating that the overall abandoned land area in the fSU is underestimated by the statistics. In comparing the estimates across the fSU countries, the widespread underestimation of abandoned land in the official national statistics due to deliberate manipulation for administrative reasons (e.g. ref. 50) should be considered.

Figure 5: Area estimates for abandoned land.
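An error-adjusted area estimate of this kind can be sketched with the standard stratified estimator, in which the mapped classes act as strata. The sketch assumes a single country, a confusion matrix of sample counts with mapped classes in rows and reference classes in columns, and known mapped areas per class; the function name and all numbers are hypothetical and are not taken from the supplementary tables.

```python
import numpy as np

def adjusted_areas(confusion, mapped_area):
    """Error-adjusted area per reference class with a 95% confidence half-width."""
    confusion = np.asarray(confusion, dtype=float)
    mapped_area = np.asarray(mapped_area, dtype=float)
    n_i = confusion.sum(axis=1, keepdims=True)           # samples per mapped class
    total = mapped_area.sum()
    weights = (mapped_area / total)[:, None]             # W_i: mapped area proportions
    share = confusion / n_i                              # n_ij / n_i
    proportion = (weights * share).sum(axis=0)           # estimated area proportions
    variance = (weights ** 2 * share * (1 - share) / (n_i - 1)).sum(axis=0)
    area = total * proportion
    half_width = 1.96 * total * np.sqrt(variance)        # 95% confidence interval
    return area, half_width

# Hypothetical counts (rows: mapped arable, abandoned, other) and mapped areas in kha.
confusion = [[80, 15, 5],
             [20, 70, 10],
             [5, 5, 90]]
mapped_area = [200.0, 100.0, 700.0]

area, half_width = adjusted_areas(confusion, mapped_area)
```

With these made-up inputs, the adjusted abandoned area is larger than the mapped one because omissions from the large “other” class outweigh the commission errors.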

In addition to the accuracy assessment presented above, we compared the hybrid map produced here with the latest ESA CCI land cover maps51 covering the period 1992–2012. To undertake this comparison, we first generated a derivative ESA CCI product containing information on cropland gain and loss over the period 1992–2012. From this derivative product, the cropland loss and gain for fSU countries were estimated to be approximately 2.3 and 5.4%, respectively. Thus the overall trend based on ESA CCI is cropland expansion (especially in Kazakhstan) rather than an increase in the area of abandoned land. This is contrary to what has been published in all other studies1,17,21 and to the official statistics reported by each country.

Usage Notes

The hybrid map reported in this paper represents a novel arable and abandoned land product, which covers more than 90% of all agricultural lands across the fSU and has many potential uses. For example, the map can be used for assessment of the biogeochemical cycles (e.g., carbon dynamics) on abandoned and cultivated fields1,8,52,53, for the analysis of the patterns and proximate causes of greening (vegetation recovery) and browning (vegetation degradation)54–57, and for investigation into the drivers of land abandonment and the implications for ecosystem services and biodiversity. The product can be used at the original resolution (10 arc-seconds, with a pixel size of approximately 4–7 ha) or aggregated to a coarser resolution such as 1 to 10 km. We envision a good alignment with and improvement of global land-use data sets such as HYDE 3.1 (ref. 58), KK11 (ref. 59), and the SAGE Global Land-Use Database60.

The hybrid map can serve as an input to a regional or country level analysis since we have achieved reasonable accuracies.

Additional information

How to cite this article: Lesiv, M. et al. Spatial distribution of arable and abandoned land across former Soviet Union countries. Sci. Data 5:180056 doi: 10.1038/sdata.2018.56 (2018).
