A dataset of 137Cs activity concentration and inventory in forests contaminated by the Fukushima accident

The majority of the area contaminated by the Fukushima Daiichi Nuclear Power Plant accident is covered with forests. We developed a dataset for radiocaesium (137Cs) in trees, soil, and mushrooms measured at numerous forest sites. The 137Cs activity concentration and inventory data reported in scientific journal papers written in English and Japanese, governmental reports, and governmental monitoring data on the web were collated. The ancillary information describing the forest stands was also collated, and further environmental information (e.g. climate) was derived from other databases using the longitude and latitude coordinates of the sampling locations. The database contains 8593, 4105, and 3189 entries of activity concentration data for trees, soil, and mushrooms, and 471 and 3521 entries of inventory data for trees and soil, respectively, which were collected from 2011 to 2017, and covers the entire Fukushima prefecture. The data can be used to document and understand the spatio-temporal dynamics of radiocaesium in the affected region and to aid the development and validation of models of radiocaesium dynamics in contaminated forests.
Measurement(s): activity (of a radionuclide) • Cesium Cs 137
Technology Type(s): digital curation
Factor Type(s): geographic location
Sample Characteristic - Organism: Cryptomeria japonica • Chamaecyparis obtusa • Pinus densiflora • Quercus serrata
Sample Characteristic - Environment: temperate evergreen needleleaf forest • temperate deciduous broadleaf forest • leaf • wood • litter layer • forest soil • mushroom (food source) • bark • pollen
Sample Characteristic - Location: Japan • Fukushima Prefecture
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13166462

In addition, trees consist of various physiologically different parts, such as leaves, branches, bark, wood, and roots; furthermore, forest systems include not only trees, but also soil surface organic layers, mineral soils, mushrooms, and various animals. The wood of trees is the most important part of the ecosystem in terms of forestry products, but to trace the dynamics of radiocaesium within forests, tracking the radiocaesium in the major functional compartments of forests is essential.
Therefore, to document the radioactive contamination of forests, and to capture more representative situations of forest contamination, a comprehensive dataset is essential [8][9][10][11][12]. To that end, we aimed to provide a database that can be used to document changes in the spatio-temporal distributions of radiocaesium in the affected region, to better understand the fate of radiocaesium in forests, and to aid the development and validation of models of radiocaesium dynamics in forests.
We collated the 137Cs activity concentration and inventory data (137Cs activity per unit ground area) reported in journal papers written in English or in Japanese, governmental reports, and monitoring data on the web provided by the government. We further collated ancillary site information, both from the sources themselves and from other databases linked by location.
The database contains 8593, 4105, and 3189 entries of activity concentration data for trees, soil, and mushrooms, and 471 and 3521 entries of inventory data for trees and soil, respectively, which were observed from 2011 to 2017; coverage is particularly intensive across the entire Fukushima prefecture. The data for mushrooms were taken across a wide area of eastern Japan, including Fukushima prefecture. As for tree species, Sugi cedar (Cryptomeria japonica), which is the most important plantation tree species in Japan, was most intensively investigated, and data for Hinoki cypress (Chamaecyparis obtusa), pine (Pinus densiflora), and oak (mainly Quercus serrata) were also abundant.
This database is a valuable resource for refining our knowledge of the biogeochemical cycling of radiocaesium within forests, and also provides a good basis for bridging disciplines and developing interdisciplinary environmental studies.

Methods
We collected all related research studies in the peer-reviewed scientific literature using the Web of Science (https://www.webofknowledge.com/) and J-Stage (https://www.jstage.jst.go.jp/). The search terms were "Fukushima", "forests", and "cesium/caesium/radiocesium/radiocaesium". Studies written in English were searched using the Web of Science, and those in Japanese using J-Stage. We further collected reports of national and local governmental projects conducted in Fukushima prefecture. We collated radiocaesium activity concentration and inventory data together with ancillary data on the forest stands and locations of the study sites. For mushrooms, we collated data from the governmental web pages that publish monitoring data. The original values were decay-corrected to the sampling date where provided. Values lower than the detection limit are shown with an inequality sign. When only the total radioactivity of 134Cs and 137Cs was reported, we estimated the radioactivity of 137Cs by decay correction, assuming that the 134Cs:137Cs ratio on 11 March 2011 was 1:1. Animals are not a major target of the database, but when we found a reference, we incorporated its data. When data were shown only in plots, we extracted values by measuring the points or bars in the plots using software (GSYS2.4, JCPRG, Japan). When longitude and latitude coordinates were provided, we added ancillary data from other databases: distance from the power plant, air dose rate and 137Cs and 134Cs deposition based on airborne surveys [13], and annual mean air temperature, precipitation, elevation, and soil type [14]. For mushroom records, ecosystem types (e.g. litter/wood decomposing) were added.
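The estimate of 137Cs from a combined 134Cs + 137Cs total, as described above, can be illustrated with a minimal sketch. It assumes the 1:1 activity ratio on 11 March 2011 stated in the text. The half-lives used here (about 2.065 years for 134Cs and about 30.17 years for 137Cs) are approximate literature values chosen for illustration and are not taken from the source, and the function names are hypothetical.

```python
from datetime import date

# Assumed approximate half-lives in years (illustrative; not from the source).
HALF_LIFE_CS134_Y = 2.065
HALF_LIFE_CS137_Y = 30.17
# Reference date on which the 134Cs:137Cs ratio is assumed to be 1:1 (from the text).
ACCIDENT_DATE = date(2011, 3, 11)
DAYS_PER_YEAR = 365.25


def decay_factor(half_life_years: float, elapsed_years: float) -> float:
    """Fraction of the initial activity remaining after elapsed_years."""
    return 0.5 ** (elapsed_years / half_life_years)


def cs137_from_total(total_activity: float, sampling_date: date) -> float:
    """Estimate the 137Cs activity from a reported 134Cs + 137Cs total.

    Both nuclides start from the same activity on ACCIDENT_DATE, so the
    137Cs share of the total at the sampling date is f137 / (f134 + f137).
    """
    elapsed = (sampling_date - ACCIDENT_DATE).days / DAYS_PER_YEAR
    f134 = decay_factor(HALF_LIFE_CS134_Y, elapsed)
    f137 = decay_factor(HALF_LIFE_CS137_Y, elapsed)
    return total_activity * f137 / (f134 + f137)


if __name__ == "__main__":
    # A total of 1000 (in any activity unit) sampled four years after the accident.
    print(cs137_from_total(1000.0, date(2015, 3, 11)))
```

Because 134Cs decays much faster than 137Cs, the 137Cs share of the total is one half on the reference date and rises with every later sampling date.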
The database consists of three separate files: 1) a data file, which contains radioactivity and ancillary data, 2) a field-description file, which explains the contents of the data file and the units of data, and 3) a reference list file, which contains the source number and the details of the reference. When the source has only a Japanese title, we translated the title into English, and included both titles in the reference list file.

Data records
The dataset, in spreadsheet format (Microsoft Excel), can be found in the ZENODO repository with the title "137Cs in forest ecosystems contaminated by the Fukushima Daiichi Nuclear Power Plant Accident" [15]. The database includes 3992 inventory data entries (3668 data entries have both activity concentration and inventory). The sampling years of the records ranged from 2011 to 2017, and the sampling sites cover the entire Fukushima prefecture (Fig. 1). The data for mushrooms were taken across a wide area of eastern Japan, covering 14 prefectures including Fukushima prefecture. For mushrooms, the data have no longitude and latitude coordinates, only the name of the municipality where the sample was taken. The database contains 8593, 4105, and 3189 entries of activity concentration data for trees, soil, and mushrooms, and 471 and 3521 entries of inventory data for trees and soil, respectively. As for tree species, Sugi cedar (Cryptomeria japonica), which is the most important plantation tree species in Japan, was most intensively investigated, and data for Hinoki cypress (Chamaecyparis obtusa), pine (Pinus densiflora), and oak (mainly Quercus serrata) were also abundant (Table 2).

Fig. 1 (caption fragment). ... [13]. Please note that longitude and latitude coordinates for some data entries are not reported in the source, even for trees and soil; hence, not all sampling sites are shown on the map.

Tree part    [species 1]  [species 2]  [species 3]  [species 4]  [species 5]  Total
Leaf                1081          194          240          118          350   1983
Branch               251          109          149          172          105    786
Bark                 360          117          185          151           31    844
Inner bark            82           36           22           28           20    188
Outer bark           416          120          252           38           21    847
Wood                 267           85          126           45            5    528
Heartwood            609          182          328          193           34   1346
Sapwood              621          190          344          215           48   1418
Pollen               115            0            0            0            0    115
Other                290            2            3          216           27    538
Total               4092         1035         1649         1176          641   8593

Table 2. Summary of counts of activity concentration data entries for major tree parts among tree species.

Technical Validation
The data were validated in two ways. First, all data entries were double-checked by a researcher who had not made the primary data entry. Second, for data with longitude and latitude coordinates, data entry mistakes were detected by plotting the data against total deposition, which was derived using the location information. In general, activity concentrations are positively correlated with total deposition; hence, we identified outliers visually, checked them against the original source, and corrected any data entry errors. Obvious errors in longitude and latitude coordinates were detected by plotting the sampling locations on a map. The mushroom data on the web contained a number of typographic errors and inconsistencies in names, which we cleaned up manually.
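The outlier check described above was done visually. For users who want to repeat a similar screening on their own subsets, the following is a minimal automated stand-in, not the procedure used for this database. It flags entries whose log10 ratio of activity concentration to total deposition lies far from the median, scored with the median absolute deviation (MAD). The record keys "concentration" and "deposition" and the threshold of 3.5 are hypothetical choices for illustration.

```python
import math
import statistics


def flag_outliers(records, threshold=3.5):
    """Return indices of records with an unusual concentration/deposition ratio.

    Each record is a dict with hypothetical keys "concentration" and
    "deposition". Non-positive values are ignored because the ratio is
    compared on a log10 scale.
    """
    usable = [
        (i, math.log10(r["concentration"] / r["deposition"]))
        for i, r in enumerate(records)
        if r["concentration"] > 0 and r["deposition"] > 0
    ]
    if len(usable) < 3:
        return []  # too few points to judge what is typical
    values = [v for _, v in usable]
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread, so no robust score can be computed
    # 0.6745 rescales the MAD so that the score is comparable to a z-score
    # for normally distributed data.
    return [i for i, v in usable if 0.6745 * abs(v - median) / mad > threshold]


if __name__ == "__main__":
    sample = [
        {"concentration": 1000, "deposition": 100000},
        {"concentration": 2500, "deposition": 200000},
        {"concentration": 400, "deposition": 50000},
        {"concentration": 9000, "deposition": 1000000},
        {"concentration": 3300, "deposition": 300000},
        {"concentration": 1200000, "deposition": 100000},  # likely entry error
    ]
    print(flag_outliers(sample))
```

Flagged entries are only candidates; as in the validation described above, each one still has to be checked against the original source before being changed.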

Usage Notes
Some data reported in different journal papers overlap (i.e. the same data were reported twice). The master data records and their duplicates are identified in the "Flag_duplication" field. To remove the duplicates and extract the master data only, use the records with no flag. The sources of duplicated data are shown in "Related_references", so the master data can be changed depending on the purpose of the study. Litter and soil data from the same vertical soil profile are identified in the "Soil_profile_number" field. This identifier can be used, for example, to sum the total inventory of a soil profile or to draw the vertical distribution in the soil.
www.nature.com/scientificdata
Reprints and permissions information is available at www.nature.com/reprints.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.