Six years of ecosystem-atmosphere greenhouse gas fluxes measured in a sub-boreal forest

Carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O) are the greenhouse gases largely responsible for anthropogenic climate change. Natural plant and microbial metabolic processes play a major role in the global atmospheric budget of each. We have been studying ecosystem-atmosphere trace gas exchange at a sub-boreal forest in the northeastern United States for over two decades. Historically our emphasis was on turbulent fluxes of CO2 and water vapor. In 2012 we embarked on an expanded campaign to also measure CH4 and N2O. Here we present continuous tower-based measurements of the ecosystem-atmosphere exchange of CO2 and CH4, recorded over the period 2012–2018 and reported at a 30-minute time step. Additionally, we describe a five-year (2012–2016) dataset of chamber-based measurements of soil fluxes of CO2, CH4, and N2O (2013–2016 only), conducted each year from May to November. These data can be used for process studies, for biogeochemical and land surface model validation and benchmarking, and for regional-to-global upscaling and budgeting analyses.


Background & Summary
Increases in atmospheric concentrations of carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O) are driving the radiative forcing of climate that has occurred since 1800 1 . While these increases are predominantly the result of human activities, significant exchanges of these gases occur naturally between terrestrial ecosystems and the atmosphere. For example, global photosynthetic uptake by terrestrial ecosystems (≈123 ± 8 Pg C y−1 as CO2, ref. 2 ) is a massive flux, but at annual time scales under current climate conditions this uptake is largely offset by a comparable efflux of respiratory carbon back to the atmosphere. By comparison, anthropogenic emissions of carbon to the atmosphere (9.5 ± 0.5 Pg C y−1 as CO2, ref. 3 ) are not offset by existing sinks. The increase in atmospheric CH4 during the industrial era (from 823 ppb in 1841 4 to over 1800 ppb at present 5,6 ) is attributed to both fossil fuel emissions and microbial emissions 5 . Importantly, soils can be either a CH4 sink or source. Anaerobic CH4-emitting microbes (methanogenic archaea) are commonly found in wetland environments, while aerobic CH4-consuming microbes (methanotrophic bacteria) are often found in upland soils. Soil processes are the dominant source of N2O, with fluxes from natural systems accounting for about 35% of global emissions 7 . N2O can be produced by microbes under both anaerobic (via denitrification) and aerobic (via nitrification) conditions 8 , although the bulk of N2O production occurs in waterlogged soils 9 . Agricultural practices (accounting for 25% of global emissions), fossil fuel combustion, and industrial activities further contribute to N2O emissions 7 . Reports of N2O consumption by soil microbes have been controversial 10,11 . Thus for each of CO2, CH4, and N2O, natural biological processes play an important role in the global budget.
Land-atmosphere fluxes and atmospheric concentrations of CH 4 and N 2 O are orders of magnitude smaller than those of CO 2 . The atmospheric lifetimes of CH 4 (12 y) and N 2 O (114 y) are also shorter than that of CO 2 (5-200 y, see ref. 1 ). But, as greenhouse gases, CH 4 and N 2 O are particularly important because of their much higher radiative forcing effect 1 . This motivates efforts to better understand the spatial and temporal patterns of land-atmosphere CH 4 and N 2 O flux, and the biotic and abiotic factors controlling these patterns.
Here, we describe a six-year data set characterizing greenhouse gas fluxes at Howland Forest, Maine 12 . Vegetation at Howland, which is located within the boreal-northern hardwood transition zone, is dominated by the conifers red spruce and eastern hemlock 13 . The climate is cold and continental, although summers are warm. Soils are generally Spodosols with high organic matter content 14 .
Tower-based measurements consist of ecosystem-atmosphere turbulent fluxes of CO2, CH4, H2O (latent heat), and sensible heat, made using the eddy covariance method and reported at 30-minute temporal resolution. While long-term CO2 and H2O flux measurements are now being conducted at hundreds of sites around the world 15 (some of these records, including data from Howland 13 , extend 20 years or more), long-term tower-based measurements of CH4 fluxes have been made at comparatively few sites, and generally in the last decade. Beyond our tower-based CH4 measurements at Howland 16 , only a handful of other studies have been published for temperate 17-19 and tropical 20,21 forests. Much more attention has been paid to wetland systems 22-26 , which are generally strong sources of CH4. Previous analysis of our data has indicated that at an annual time step, Howland Forest switches from a weak CH4 source to a weak CH4 sink depending on hydrologic conditions during late summer 16 .
We have also conducted measurements of soil-atmosphere greenhouse gas fluxes using automated chamber-based methods 27 .

A subset of the dataset 29 described here is available through the AmeriFlux data portal 30 . We have two goals in describing and distributing a more complete dataset via Figshare. First, we aim to document tower and chamber flux measurements (e.g., instruments, processing, and QC) that had not yet been fully described in our previous papers 12,16 . Second, we are making data publicly available that cannot otherwise be handled or distributed through the current AmeriFlux data distribution system. This includes a variety of important variables output through the flux processing software (variances and covariances, flux uncertainties, spectral correction factors, and trace gas time lags) as well as all of the chamber data.
These data will be of use for investigations into the factors controlling greenhouse gas fluxes; for validation of ecosystem, biogeochemical, and earth system models; and for upscaling and budgeting analyses. While the CH 4 and N 2 O fluxes from Howland are small compared to other systems, we argue that to accurately estimate global budgets, it is as important to know where the fluxes are small as it is to know where they are large.

Study site.
Research was conducted at the Howland Forest AmeriFlux site (Fig. 1), located about 35 miles north of Bangor, Maine, USA (45.2041°N 68.7402°W, elevation 60 m above sea level) on forestland owned by the Northeast Wilderness Trust. The site sits at the southern ecotone of the North American boreal spruce-fir zone. Red spruce (Picea rubens Sarg.) and eastern hemlock (Tsuga canadensis (L.) Carr.) together account for about 70% of basal area, with other conifers (northern white cedar, Thuja occidentalis; balsam fir, Abies balsamea; and white pine, Pinus strobus) together accounting for 20% of basal area. Hardwoods, including red maple (Acer rubrum L.) and paper birch (Betula papyrifera Marsh.), together account for 10% of basal area 13 . Seasonality in leaf area index of the evergreen canopy is minimal; peak LAI during the growing season is about 5 m2 m−2. The undisturbed stand (mean age ≈120 y, maximum age ≈225 y; basal area 48 ± 17 m2 ha−1; canopy height ≈20 m) surrounding the "main tower" (one of four instrumented research towers at the site) is atypical of the regional landscape, where intensive forestry activities have taken place for over a century. Topography is flat to gently rolling. Soils range from well drained to poorly drained. Mean annual temperature is 6.1 °C and mean annual precipitation is 990 mm. The seasonal patterns of variation in environmental factors, phenology, and ecosystem-atmosphere fluxes are illustrated in Fig. 2. Climate, soils, and vegetation at the site are described in greater detail in earlier publications 12-14 and documented in the AmeriFlux BADM (Biological, Ancillary, Disturbance and Metadata) file for this site (see "Additional Files" in the Data Records section, below). More recent publications comprehensively document the forest stand composition, structure, and growth 31,32 in the vicinity of the main tower.
[Fig. 2 caption, beginning missing] (1) snow melt, (2) last frost, (3) budburst of deciduous trees, (4) budburst of evergreen trees, (5) deciduous trees drop leaves, (6) first frost, (7) first persistent snow; (b) half-hourly air temperature; (c) half-hourly soil temperature; (d) daily canopy greenness, derived from PhenoCam imagery; (e) half-hourly net ecosystem exchange of CO2; (f) half-hourly net ecosystem exchange of CH4; (g) half-hourly sensible heat flux (H); (h) half-hourly latent heat flux (LE).

[...] capabilities to include eddy covariance measurements of CH4 fluxes. An improved gas analyser for CH4 fluxes was installed in 2012 16 . Here we describe measurements from that instrument over the period 2012 to 2018.
Fluxes were measured at a height of 31 m with an instrument system consisting of a model SAT-211/3 K 3-axis sonic anemometer (Applied Technologies Inc., Longmont, CO, USA) and a fast-response CH4/CO2/H2O cavity ring-down spectrometer (model G2311-f, Picarro Inc., Santa Clara, CA, USA). Sampled air was pulled from the top of the tower through ~46 m of 4.8 mm (inner diameter) LLDPE (U.S. Plastics Corp., Lima, OH, USA) tubing (replaced annually), sheathed in flexible PVC pipe to minimize temperature fluctuations, using a vacuum pump (model MD4-NT, Vacuubrand GmbH, Wertheim, Germany) to maintain low cavity pressure and a flow rate ≥7 standard litres per minute. The distance between the air inlet and the sonic anemometer was less than 30 cm. All data, including concentrations of CH4 and CO2 reported as dry air mole fractions (mixing ratios), were recorded at 5 Hz on a data logger (model CR1000, Campbell Scientific Inc., Logan, UT, USA [...] 35 . Turbulent fluxes calculated include sensible (H) and latent (LE) heat fluxes, as well as fluxes of CO2 (CO2_flux) and CH4 (CH4_flux). The custom EddyPro settings used are summarized in Table 1. We processed the data in two batches: June 2012-June 2015 and June 2015-June 2018. The resulting files were concatenated in chronological order. The full EddyPro output file, with no filtering, is included here.
Filtered tower-based flux measurements. We used the EddyPro output file to generate a filtered data set, also included here, which follows AmeriFlux standard formats and which is recommended for most applications. Following methods we have used at Howland for over 20 years 12,13 , we created a 14-bit QC flag that assessed each half-hour against a range of criteria we have found useful (Table 2). These include thresholds for wind speed, sonic anemometer temperature "spikes", and sensor variance (insufficient variance likely indicating no turbulence or a failed pump; excess variance indicating material on the sonic transducers, system leaks, or analyser malfunction). If a condition was true, the appropriate bit of the Howland QC flag was set to 1.
In the filtered data set, we excluded turbulent fluxes if any relevant bit of the Howland QC flag was set to 1. We applied the flags assuming a hierarchy of flux measurements, i.e. if H was flagged, then LE was also flagged; if LE was flagged, then CO2_flux was also flagged; and if CO2_flux was flagged, then CH4_flux was also flagged. Thus bits 1 through 7 were applied to H, bits 1 through 9 were applied to LE, bits 1 through 12 were applied to CO2_flux, and bits 1 through 14 were applied to CH4_flux. Additionally, if H was flagged, then all other measurements derived from the sonic anemometer (including sonic temperature, Tau, u*, and wind speed and direction) were also flagged. Flagged values were set to −9999 in the filtered data set.
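The flag hierarchy described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the data set: the function names and the `MAX_BIT` dictionary are hypothetical, and bit 1 is assumed to be the least significant bit, consistent with the worked example in the Table 2 caption (decimal 546 = bits 2, 6 and 10).

```python
# Highest QC bit applied to each flux, per the hierarchy described in the text.
# (Dictionary and function names are illustrative, not from the data files.)
MAX_BIT = {"H": 7, "LE": 9, "CO2_flux": 12, "CH4_flux": 14}


def set_bits(flag_decimal):
    """Return the 1-based positions of the bits set in a 14-bit QC flag."""
    return [bit for bit in range(1, 15) if (flag_decimal >> (bit - 1)) & 1]


def flagged_fluxes(flag_decimal):
    """Map each flux to True if any bit in its range (1..max bit) is set."""
    bits = set_bits(flag_decimal)
    return {name: any(b <= top for b in bits) for name, top in MAX_BIT.items()}


def to_binary(flag_decimal):
    """Format a decimal flag as a 14-character binary string."""
    return format(flag_decimal, "014b")


if __name__ == "__main__":
    print(set_bits(546), to_binary(546))
    print(flagged_fluxes(512))  # only bit 10 set: CO2_flux and CH4_flux flagged
```

Because each flux inherits every bit below its own maximum, a flag that affects H automatically propagates to LE, CO2_flux and CH4_flux without any additional logic.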
We next used a simple, empirically based outlier detection method to identify the small number of remaining flux values that were statistically inconsistent with other measurements made under similar environmental conditions. To do this, we used a regression approach that accounted for covariation of environmental factors, and phenological effects associated with the time of year. We then calculated the interquartile range (IQR = Q3 − Q1, where Q3 and Q1 are the upper and lower quartiles, respectively) of the regression residuals, separately according to day vs. night and time of year. We conservatively excluded fluxes that were more than 6*IQR above Q3 or below Q1. Similar methods are commonly used in the literature, but a more aggressive threshold (e.g. 3*IQR) is typically used. Based on our previous work, we recognize that flux measurement errors have a leptokurtic distribution 36 , and large measurement errors are thus more likely than if errors followed a Gaussian distribution. Our goal was not complete "cleaning" of the data set, which might have resulted in discarding of valid measurements, but rather to identify the most extreme outliers.

We use the micrometeorological sign convention: flux into the ecosystem (e.g. photosynthetic CO2 uptake, CH4 consumption) is defined as a negative flux, whereas flux from the ecosystem to the atmosphere is a positive flux. Note that the flux units for both CO2 and CH4 are μmol m−2 s−1 in the unfiltered data set, but in the filtered data set, following the AmeriFlux convention, the flux units for CH4 are nmol m−2 s−1.

Environmental measurements. We have for many years conducted measurements, from the main Howland tower, of key environmental and meteorological variables that are relevant to interpretation and modelling of ecosystem-atmosphere flux data.
Like the tower fluxes, these data are reported at a 30-minute temporal resolution, which typically represents the mean of higher-frequency instantaneous measurements. For example, solar radiation measurements are taken every 15 s, but only the 30-minute mean is logged.
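Returning briefly to the outlier screen applied to the tower fluxes, the 6*IQR rule can be sketched as follows. This is a minimal illustration under stated assumptions: it takes regression residuals for a single group (one day/night and time-of-year class) as given, does not reproduce the regression itself, and uses the default quartile estimator of Python's `statistics.quantiles`, which may differ from the estimator used by the authors. The return codes 2 and 3 follow the summary flag values described in the Data Records section; the function name is hypothetical.

```python
import statistics


def iqr_outlier_flags(residuals, k=6.0):
    """Flag residuals lying more than k*IQR below Q1 (2) or above Q3 (3).

    Returns a list of codes: 0 = retained, 2 = low outlier, 3 = high outlier.
    """
    q1, _median, q3 = statistics.quantiles(residuals, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [2 if r < lower else 3 if r > upper else 0 for r in residuals]


if __name__ == "__main__":
    # Hypothetical residuals for one day/night and time-of-year group.
    example = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.2, -0.2, 50.0, -50.0]
    print(iqr_outlier_flags(example))
```

Setting `k=3.0` gives the more aggressive threshold mentioned in the text; with heavy-tailed (leptokurtic) errors, the larger multiplier removes only the most extreme values.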

Environmental and meteorological measurements reported here include air temperature (shielded, ventilated platinum resistance thermometer), solar radiation (photosynthetic photon flux density, PPFD; model PAR lite quantum sensor, Kipp & Zonen, Delft, the Netherlands), net radiation (model CNR-4, Kipp & Zonen, Delft, the Netherlands), precipitation (heated tipping bucket rain gage, model TR-525; Texas Electronics, Dallas, TX, USA), and air pressure (model PTB100A analog barometer; Vaisala, Vantaa, Finland), all of which are measured at the top of the tower. Additionally, soil temperature at 10 cm depth (thermocouple) and water table depth (submersible pressure transducer model WL400; Global Water Instrumentation, College Station, TX, USA) have been measured 30 m from the base of the tower.
Through intercomparison of the PPFD, shortwave, and longwave radiation measurements at the main Howland tower (US-Ho1) together with simultaneous measurements made at the west Howland tower (US-Ho2; located 800 m away), and modeled clear-sky incident shortwave fluxes 37 , we screened the radiation data sets for extreme outliers, which could be attributed to instrument malfunction and snow on sensors. The number of half-hourly data points excluded in this way was generally very small, and in all cases well under 1% of the measured values (Table 3). Differences between incoming longwave radiation measured at the main and at the west tower (LW_IN_1_1_1 and LW_IN_2_1_1) are attributed to the lack of a heater/blower on the west tower instrument. A scatter plot of the outgoing shortwave radiation measurements at the main and west tower (SW_OUT_1_1_1 and SW_OUT_2_1_1) reveals an interesting nonlinear ("banana") shape which implies some differences in surface reflectance as a function of solar elevation. The maximum measured difference between the two sensors is approximately 20 W m−2. Because the two shortwave sensors are on different towers, some differences are to be expected: the main tower is a walk-up tower, with a larger canopy hole, while the west tower is a triangular mast with a smaller canopy hole. The influence of the tower itself may be larger at the main tower. While the forest composition and structure are similar between the two towers, they are not identical. There may be differences in shadowing, canopy continuity and ground view (including snow on ground), and even the dominant species that are most prominent in the field of view of the instrument. Individually or together, these differences are likely sufficient to explain the observed difference in reflected shortwave radiation.
Chamber measurements. An automated, chamber-based system was used to quantify soil CO2, CH4 and N2O fluxes within the footprint of the main Howland tower (Fig. 1). The system, and details of sampling methods and data processing, are described in detail in previous publications 27,28 . Briefly, automated chambers (each 30.5 cm in diameter; between measurements the chamber top was lifted, using a pneumatic piston, off a PVC collar permanently inserted into the soil surface) were installed in one of three topographic positions: (1) upland: forest-dominated, and characterized by well-drained soils; (2) transitional: sphagnum-dominated, and characterized by sporadic inundation; and (3) [...] 1 m deep, and characterized by continuous inundation. (The upland plots were also the site of a trenching experiment that was initiated in late fall of 2012 to permit partitioning of soil respiration to autotrophic and heterotrophic components 38 . Root exclusion trenches 1 m deep were dug around three 5 m × 5 m plots; the trenches were then lined with plastic sheeting and backfilled. One automated chamber was placed in each of the trenched plots and three chambers were left in their original upland positions as controls. Measurement of the trenched plots occurred from 2012-2015, and these data are included here.)

Table 2. Interpretation of custom quality flag descriptor and associated criteria. Flags are encoded as 14-bit values in binary notation, where for each bit, 1 = true and 0 = false for the criteria above. For convenience, flags are reported in both binary and decimal notation. For example, if u < 0.5 m s−1 (bit 2 = 1; binary value = 00000000000010, decimal value = 2), sonic anemometer T variance > 2.5 (bit 6 = 1; binary value = 00000000100000, decimal value = 32), and Picarro CO2 variance < 0.015 (bit 10 = 1; binary value = 00001000000000, decimal value = 512), the quality flag descriptor would be reported as a binary value of 00001000100010, and a decimal value of 2 + 32 + 512 = 546. The "frequency" column reports the proportion of half-hourly periods (of ≈106,000 half-hourly periods in the 6-year data set reported here) receiving each quality flag. In total, slightly more than one-third (35.5%) of all half-hourly periods had a non-zero 14-bit quality flag.
Where the chambers were installed, and what trace gas fluxes were measured, varied among years. Deployments are summarized by year and topographic position in Table 4, with chamber specifics in Table 5. Because of differences among years in the measurement objectives and number of chambers deployed, the exact frequency at which a specific chamber was sampled may have varied over time.
From 2012 to 2016, soil fluxes from each chamber were measured approximately once per hour, 24 h per day, during the snow-free period when vegetation was active (May to November). Different gas analysers were deployed depending on the measurement objectives. To measure soil CO 2 fluxes, we used an infrared gas analyser (model 6252; LI-COR Biosciences, Lincoln, NE, USA); to measure soil CO 2 and CH 4 fluxes, we used a cavity ring-down spectrometer (model G2121-i; Picarro Inc., Santa Clara, CA, USA); to measure soil CH 4 and N 2 O fluxes, we used a quantum cascade laser (TILDAS CS, Aerodyne Research Inc., Billerica, MA, USA). To measure soil CO 2 , CH 4 , and N 2 O fluxes, we used the infrared gas analyser and the quantum cascade laser in series 28 .
Trace gas fluxes were determined using chamber headspace concentrations measured (1 Hz) over a 4-minute period, beginning 60 s and ending 300 s after the chamber top closed. Thus, each measurement sequence required 5 minutes. We note that noise in the 1 Hz concentration data output by the analyser will propagate directly to uncertainty in the calculated flux, particularly when the flux is small and the noise is relatively large in comparison to the change in headspace concentration. Thus, CH 4 fluxes calculated from the quantum cascade laser measurements have better precision than fluxes calculated from the cavity ring-down spectrometer, and CO 2 fluxes have better precision than either the CH 4 or N 2 O fluxes.
Fluxes were calculated from the linear regression of change in headspace concentration over time and were scaled up from the collar area, corrected for atmospheric pressure and temperature. Units for the fluxes are as follows: CO2 flux, μmol CO2 m−2 s−1 (= 43.2 mg C-CO2 m−2 hr−1); CH4 [...]

Following our standard soil respiration QC procedures 27 , measured CO2 fluxes were excluded if the correlation between headspace CO2 concentration and time was insufficiently high (R2 < 0.9), on the assumption that a poor correlation (nonlinear or noisy) likely indicates that the chamber lid did not close properly. All soil fluxes have been filtered to remove data obtained when the measurement system was compromised, e.g. power or instrument failure, and during periods of instrument calibration or testing. As with the eddy covariance measurements, our sign convention is that a negative flux indicates uptake by the soil (e.g., CH4 consumption is a negative flux), and a positive flux indicates emission from the soil (e.g., respiration of CO2 is a positive flux).
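The chamber flux calculation outlined above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' processing code: the ideal gas law is assumed for the pressure and temperature correction, the chamber headspace volume (`volume_m3`) is a hypothetical input because it is not given in the text, and all function names are invented for the example. The collar area follows from the stated 30.5 cm chamber diameter, and the regression window (60 s to 300 s after closure) and the R2 < 0.9 rejection rule are taken from the text.

```python
import math

R_GAS = 8.314  # universal gas constant, J mol-1 K-1
COLLAR_AREA_M2 = math.pi * (0.305 / 2) ** 2  # from the 30.5 cm chamber diameter


def linear_fit(t, c):
    """Ordinary least-squares slope and R^2 of c against t."""
    n = len(t)
    mt, mc = sum(t) / n, sum(c) / n
    sxx = sum((x - mt) ** 2 for x in t)
    syy = sum((y - mc) ** 2 for y in c)
    sxy = sum((x - mt) * (y - mc) for x, y in zip(t, c))
    r2 = sxy ** 2 / (sxx * syy) if syy > 0 else 0.0
    return sxy / sxx, r2


def chamber_flux(t_s, conc_ppm, volume_m3, area_m2, pressure_pa, temp_k,
                 t_start=60, t_end=300, r2_min=0.9):
    """Flux in umol m-2 s-1, or None if the fit fails the R^2 criterion."""
    window = [(x, y) for x, y in zip(t_s, conc_ppm) if t_start <= x <= t_end]
    ts, cs = zip(*window)
    slope, r2 = linear_fit(ts, cs)  # ppm s-1, i.e. umol mol-1 s-1
    if r2 < r2_min:
        return None
    mol_air = pressure_pa * volume_m3 / (R_GAS * temp_k)  # ideal gas law
    return slope * mol_air / area_m2


def umol_to_mg_c_per_hr(flux_umol):
    """Convert umol m-2 s-1 to mg C m-2 hr-1 (12.011 g C per mol)."""
    return flux_umol * 12.011e-3 * 3600.0
```

The unit conversion helper reproduces the factor quoted in the text: 1 μmol CO2 m−2 s−1 corresponds to about 43.2 mg C m−2 hr−1.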

Data Records
The data set presented here, which is available within Figshare 29 and released under a CC-BY 4.0 license, consists of (1) the "tower flux" data files, which include three files derived from our primary gas analyser (Picarro CRDS) as well as a fourth file derived from our secondary (backup) gas analyser (LI-COR IRGA); (2) a "chamber flux" data file; and (3) [...]

For ease of use, we have divided the tower flux data into four separate files, as follows:

(1) Unfiltered EddyPro output. This file contains the processed but unfiltered tower fluxes (calculated using data from the Picarro CRDS), as output by the EddyPro software at a 30 minute time-step, as well as the associated enviro-meteorological data, and is named US-Ho1_HH_201206060000_201806302330_EP.csv. The columns of this data file are described in Online-only [...]

[...] or fails standard Howland QC criteria; 2, outlier more than 6*IQR below Q1; and 3, outlier more than 6*IQR above Q3. The summary flag has been applied to the turbulent fluxes reported in the filtered flux file; measured fluxes are reported if the summary flag equals zero, and are set to −9999 if the summary flag equals 1, 2 or 3. Additionally, a summary flag value of 4 is used to indicate suspect nocturnal data (based on a u* threshold; see Usage Notes), although following AmeriFlux data standards we have not removed these measurements from the filtered data set. The QC and outlier flags file is named US-Ho1_HH_201206060000_201806302330_QC.csv. The columns of this data file are described in Table 6. This file is distributed through Figshare as the data it contains cannot be distributed via AmeriFlux.
(3) Filtered half-hourly AmeriFlux-format dataset. This file contains the filtered tower fluxes, as well as associated enviro-meteorological data, at a 30 minute time step, formatted according to AmeriFlux standards. Following AmeriFlux naming conventions, the AmeriFlux-format tower fluxes dataset is named US-Ho1_HH_201206060000_201806302330.csv. The columns of this file are described in Table 7. This file is distributed through Figshare, and an identical file has been uploaded to the AmeriFlux data archive, where it has undergone the standard AmeriFlux checks for data quality and consistency, and where it is available as part of the larger US-Ho1 data record (since 1996) 30 .

(4) Unfiltered EddyPro output for a second gas analyser. This file contains the processed but unfiltered tower fluxes (calculated using data from the LI-COR Li-7200 IRGA; note that this instrument does not measure CH4), as output by the EddyPro software at a 30 minute time-step, and is named US-Ho1_HH_201206060000_201806302330_EP LI-COR.csv. The columns of this data file are described in Online-only Table 1. This dataset is only distributed through Figshare; fluxes calculated from the LI-COR analyser have not been uploaded to AmeriFlux because of concerns about system performance in 2018. Fluxes from this data file were used in the technical validation analyses described below.
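When working with the filtered file, the −9999 placeholder should be converted to a proper missing value before any averaging, and the CH4 flux units (nmol m−2 s−1 in the filtered file, μmol m−2 s−1 in the unfiltered file) should be reconciled. The sketch below shows one way to do this with the Python standard library. The column names `TIMESTAMP_START`, `FC` and `FCH4` are assumptions made for the example and should be checked against the column descriptions in Table 7; the inline sample data are invented.

```python
import csv
import io

MISSING = -9999.0


def parse_value(raw):
    """Convert a CSV field to float, mapping the -9999 placeholder to None."""
    number = float(raw)
    return None if number == MISSING else number


def read_flux_rows(text, timestamp_prefix="TIMESTAMP"):
    """Read CSV text into dicts, leaving timestamp columns as strings."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            key: value if key.startswith(timestamp_prefix) else parse_value(value)
            for key, value in record.items()
        })
    return rows


def nmol_to_umol(value):
    """Convert nmol m-2 s-1 to umol m-2 s-1, passing missing values through."""
    return None if value is None else value / 1000.0


if __name__ == "__main__":
    # Invented two-row sample with hypothetical column names.
    sample = (
        "TIMESTAMP_START,FC,FCH4\n"
        "201206060000,-9999,1.5\n"
        "201206060030,-3.2,-9999\n"
    )
    for row in read_flux_rows(sample):
        print(row["TIMESTAMP_START"], row["FC"], nmol_to_umol(row["FCH4"]))
```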
Chamber fluxes. The "chamber flux" data file contains measurements of soil fluxes of CO2, CH4, and N2O, with measurements from each chamber reported approximately hourly during the growing season (2012-2016). The chamber fluxes file is named US-Ho1_CMB_201201010000_201701010000.csv, and the columns of the data file are described in Table 8. This file is distributed only through Figshare, as it contains data that cannot be distributed through AmeriFlux.
Additional files. The configuration and metadata files used for the EddyPro processing described here (files:

Technical Validation
Site overview. The forest in the vicinity of the main Howland tower is nearly ideal from the perspective of making tower-based flux measurements over tall vegetation; forest cover is extensive and homogeneous, and the topography is generally flat 13 . Because Howland is one of the longest-running AmeriFlux sites, its eddy covariance flux measurements have been carefully scrutinized over the last two decades. For example, the environmental and flux measurements from the main Howland tower have been regularly evaluated against data recorded by the AmeriFlux Portable Eddy Covariance System, which was most recently deployed adjacent to our own instrumentation for a 10-day period in the summer of 2016. Additionally, since 1998 environmental and flux measurements have also been conducted at the "west" Howland tower, located about 800 m to the north-west of the main tower, in an extensive forest stand with composition and structure similar to that surrounding the main tower. Analysis of the coherence spectra for environmental variables and fluxes recorded on the two towers has shown excellent agreement between the two measurement systems over time scales of hours to days, while at the annual time step, the net ecosystem exchange of CO2 measured at the two towers was found to differ by less than 6% 12 . These analyses point to the high quality of eddy covariance flux measurements at Howland Forest, and the representativeness of the main tower in relation to the immediately surrounding landscape. We note also that data from the main and west towers were used to develop a novel method of assessing the random uncertainty in 30-minute CO2, H2O and energy fluxes 12,40 , which has then been applied to estimate uncertainties in annual ecosystem C budgets 41,42 . Thus, in general the eddy covariance fluxes measured at Howland are known to be of high quality, with well-characterized uncertainties.

Table 6 (partial). Columns of the QC and outlier flags file.
[...]: [...] Table 2. Zero values indicate good data.
HowQC_Dec: Sum of Howland QC flags (bits 14 through 1), in decimal notation.
HowQC_Bin: Sum of Howland QC flags, expressed in 14-bit binary notation.
H_HowQC: Howland QC flag for H and other quantities derived from sonic anemometer, set to 0 (good data) if the sum of QC flag bits 1 through 7 equals 0, and 1 otherwise (bad data).
LE_HowQC: Howland QC flag for LE, calculated based on the sum of QC flag bits 1 through 9.
CO2_flux_HowQC: Howland QC flag for CO2_flux, calculated based on the sum of QC flag bits 1 through 12.
CH4_flux_HowQC: Howland QC flag for CH4_flux, calculated based on the sum of QC flag bits 1 through 14.
H_flag, LE_flag, CO2_flux_flag, CH4_flux_flag: Summary flag for filtering H, LE, CO2_flux and CH4_flux; 0 = valid measurement, 1 = missing or fails standard Howland QC criteria, 2 = outlier more than 6*IQR below Q1, and 3 = outlier more than 6*IQR above Q3 [...]
Here we conduct three additional analyses to further assess the technical quality of the tower-based measurements. First, we compare LE and CO 2 fluxes calculated using H 2 O and CO 2 concentrations measured with our Picarro CRDS against those calculated using concentrations measured simultaneously with a LI-COR IRGA. Second, we compare the long-term patterns in CH 4 concentration measured with our analyser against independent atmospheric CH 4 concentration measurements from two climate monitoring observatory stations. Finally, we conduct an analysis of the quality control flags and estimated random uncertainties in the CH 4 flux measurements.
Comparison of fluxes calculated using independent CO2 mixing ratio measurements. Since we installed the Picarro CRDS at Howland in 2012, we have operated it in parallel with a co-deployed fast response closed-path CO2/H2O infrared gas analyser (IRGA model Li-7200, Li-Cor Inc., Lincoln, NE) for redundancy and quality assurance. (Prior to 2012, we exclusively used closed-path LI-COR IRGAs for flux measurements on the main Howland tower.) The two instruments have independent air sampling systems (tubing, pump, and flow control), although air inlets are located adjacent to each other at the top of the tower. For flux calculations, orthogonal wind components from a single sonic anemometer are used in conjunction with the H2O and CO2 concentration (for CO2, dry air mole fraction) data reported from each analyser. The level of agreement between the fluxes calculated from these two systems (see Fig. 3 for a comparison using 2012 data; see Table 9 for statistics for all years 2012-2018; see also ref. 16 ) gives us confidence in the overall quality of the fluxes (specifically LE and CO2_flux, and by extension CH4_flux) measured using the Picarro CRDS. While the agreement between the two analysers is not as good in 2018 compared to the previous years, we attribute this to known issues with the LI-COR-based system in that year, including analyser calibration and pump/flow controller problems which do not affect the Picarro measurements.
Long-term assessment of CH 4 analyser performance. For Howland, we calculated the monthly mean CH 4 concentration (dry air mole fraction) from the 30-minute mid-day (10 am to 2 pm, local standard time) mean values. From the approximately 18,000 mid-day half-hourly data points recorded between 2012 and 2018, we excluded from the calculation 323 half-hourly measurements where the measured CH 4 concentration was greater than 2600 ppb (61% of these high measurements occurred during a brief period late in 2014), and 8 half-hourly measurements where the measured CH 4 concentration was less than 1500 ppb. There were 1075 missing data points when the CH 4 concentration was not recorded due to power or instrument failure. Within each monthly period, the standard deviation of the mean half-hourly CH 4 concentrations had a mean value of 20 ppb, and with ≈240 measurements averaged each month, the standard error of the mean was in almost all cases less than 2 ppb. The monthly median tended to be somewhat lower (by 4 ± 3 ppb) than the monthly mean, but the temporal patterns were essentially identical.
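The monthly summary described above can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes the mid-day half-hourly CH4 values for a single month have already been selected, uses `None` for missing records, and applies the 1500 ppb and 2600 ppb screening bounds from the text. The function name and return structure are hypothetical.

```python
import math
import statistics


def monthly_ch4_summary(values_ppb, lower=1500.0, upper=2600.0):
    """Summarize one month of mid-day half-hourly CH4 mole fractions (ppb).

    Missing values (None) and values outside [lower, upper] are excluded.
    """
    kept = [v for v in values_ppb if v is not None and lower <= v <= upper]
    n = len(kept)
    sd = statistics.stdev(kept)
    return {
        "n": n,
        "mean": statistics.fmean(kept),
        "median": statistics.median(kept),
        "sd": sd,
        "sem": sd / math.sqrt(n),  # standard error of the mean
    }
```

With a within-month standard deviation of 20 ppb and roughly 240 retained half-hours, the standard error is 20/√240 ≈ 1.3 ppb, consistent with the statement that it was almost always below 2 ppb.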
In Fig. 4, we compare the Howland (Maine) data with data from Mauna Loa (Hawaii) and Barrow (Alaska), where ongoing long-term atmospheric CH4 concentration measurements are maintained by researchers from the National Oceanic and Atmospheric Administration (NOAA) 43,44 . For the NOAA data, sub-hourly measurements have been similarly filtered for outliers (<1500 ppb or >2600 ppb), averaged to hourly values, and then screened to distinguish samples of regionally representative air. These are then filtered using a rule-based editing algorithm to exclude measurements obtained when the analytical instrument was not working properly. The NOAA instruments (an automated gas chromatograph using flame ionization detection at Mauna Loa, and since 2013 a laser-based optical analyser at Barrow) are regularly calibrated against reference standards.
Overall, the monthly mean CH 4 concentrations from Howland show two obvious features. First, there is a pronounced seasonal cycle, with CH 4 varying by 30-40 ppb between a summertime minimum and wintertime maximum. Second, there is a clear rising trend, with CH 4 increasing at a rate of almost 10 ppb per year, from an annual mean of just under 1910 ppb at the start of our measurement record to almost 1960 ppb at the end of our record. The excellent agreement between the Howland CH 4 measurements and the NOAA measurements, particularly for Barrow, demonstrates the long-term calibration of our instrument (specifically, the lack of calibration drift and hence the overall accuracy). Together with precision statistics reported by the manufacturer, this gives us confidence in the sustained quality of our CH 4 flux measurements. There is no evidence of degraded instrument performance over the six years of measurements.
Assessment of quality control flags and random uncertainty in tower CH 4 fluxes. Across the more than 100,000 half-hourly periods covered by the eddy covariance dataset, there were missing CH 4 fluxes (due to power or instrument failure) only 8% of the time. A further 15% were assigned a Mauder and Foken 39 (M&F) QC flag of 2, indicating low quality measurements. Therefore, more than 75% of the time the fluxes were considered to be of "usable" quality, with 37% receiving an M&F QC flag of 0 (the highest quality) and 41% an M&F QC flag of 1.
Within EddyPro, the method of Finkelstein and Sims 45 was used to estimate random uncertainties in all calculated fluxes. Because the CH 4 fluxes measured at Howland are generally small, an important question is whether we are measuring signal (i.e. exceeding the detection limit for a measurable flux) or noise. If the ratio of the measured flux to the uncertainty has an absolute value greater than 2, then the measured flux can be considered significantly different from zero with high (95%) confidence. For CH 4 fluxes with an M&F QC flag of 0, the median uncertainty ratio was 2.4; 65% of the time the uncertainty ratio was greater than 2, and 32% of the time it was greater than 3. For CH 4 fluxes with an M&F QC flag of 1, the median uncertainty ratio was 1.9; 46% of the time the uncertainty ratio was greater than 2, and 22% of the time it was greater than 3. Thus, although CH 4 fluxes measured at Howland tend to be small in magnitude, they are commonly above the detection limit of the eddy covariance method.
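The detection-limit check described above amounts to a few lines of arithmetic. The sketch below is illustrative only: the function name and input layout are hypothetical, and treating a ratio greater than 2 as roughly 95% confidence assumes the random error is approximately Gaussian, with the uncertainty interpreted as one standard deviation.

```python
from statistics import median


def uncertainty_ratio_summary(fluxes, uncertainties, thresholds=(2.0, 3.0)):
    """Summarise |flux / random uncertainty| for one QC class.

    fluxes, uncertainties: equal-length sequences of half-hourly values in
    the same units. Pairs with a missing or non-positive uncertainty are
    skipped (an assumption made here, not a rule from the text).
    """
    ratios = [
        abs(f / u)
        for f, u in zip(fluxes, uncertainties)
        if f is not None and u is not None and u > 0
    ]
    if not ratios:
        raise ValueError("no valid flux/uncertainty pairs")

    summary = {"n": len(ratios), "median": median(ratios)}
    for th in thresholds:
        # Fraction of half-hours whose flux exceeds `th` times its uncertainty
        summary[f"frac_gt_{th:g}"] = sum(r > th for r in ratios) / len(ratios)
    return summary
```

Applied separately to the half-hours with an M&F QC flag of 0 and of 1, a summary of this kind would yield the median ratio and the fractions above 2 and 3 that are quoted in the text.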
The above three analyses indicate the overall quality and technical validity of the tower-based fluxes that we report here.
Automated chamber measurements. Uncertainties in our chamber-based soil flux measurement system have been assessed and quantified in several previous publications, which focused on the measurement of soil CO 2 efflux 27,46 . Indeed, based on work at Howland, we have previously concluded that "[w]hile … potential sources of measurement error and sampling biases must be carefully considered, properly designed and deployed chambers provide a reliable means of accurately measuring soil respiration in terrestrial ecosystems" 47 .
Table 9. Correlation and regression statistics, by year, for agreement between ecosystem-atmosphere fluxes calculated using trace gas concentrations measured with two different gas analyzers (Picarro CRDS and LI-COR IRGA). Units are W m −2 for latent heat flux (LE), and μmol m −2 s −1 for CO 2 flux. N is the number of half-hourly measurements included in the comparison; correlation is Pearson's r; slope and intercept are least-squares regression statistics (y = LI-COR flux; x = Picarro flux).

We have also published a detailed quality assessment of the uncertainties in CH 4 and N 2 O fluxes measured with our chamber system using the quantum cascade laser 28 . This analysis showed that the response time of the analyser was sufficiently fast, and sensitivity was sufficiently high, that we could measure fluxes quickly enough so as not to influence soil concentration gradients. Furthermore, we determined the minimum detectable fluxes using the method of Verchot et al. 48 ; for the automated chamber system deployed at Howland these were estimated to be very low: ±0.12 μg CH 4 -C m −2 h −1 (=0.0028 nmol CH 4 m −2 s −1 ) and ±0.05 μg N 2 O-N m −2 h −1 (=0.000496 nmol N 2 O m −2 s −1 ). Detection of such small fluxes is possible because of the high precision of the QCL instrument.
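The unit conversions quoted for the minimum detectable fluxes can be checked with a short calculation. The function name below is hypothetical, and the molar masses are standard values rather than figures taken from the text. Note that each N 2 O molecule carries two N atoms, so the N-based flux is halved when converting to moles of gas.

```python
C_MOLAR_MASS = 12.011  # g mol-1, standard atomic weight of carbon
N_MOLAR_MASS = 14.007  # g mol-1, standard atomic weight of nitrogen


def ug_element_h_to_nmol_gas_s(flux_ug_per_m2_h, molar_mass, atoms_per_molecule):
    """Convert ug of element (C or N) m-2 h-1 to nmol of gas m-2 s-1."""
    # ug divided by g mol-1 gives umol; multiply by 1000 for nmol of element
    nmol_element_per_h = flux_ug_per_m2_h / molar_mass * 1000.0
    # Divide by atoms of the element per gas molecule, then by seconds per hour
    return nmol_element_per_h / atoms_per_molecule / 3600.0


ch4_limit = ug_element_h_to_nmol_gas_s(0.12, C_MOLAR_MASS, 1)  # one C per CH4
n2o_limit = ug_element_h_to_nmol_gas_s(0.05, N_MOLAR_MASS, 2)  # two N per N2O
print(f"CH4: {ch4_limit:.4f} nmol m-2 s-1")  # prints 0.0028
print(f"N2O: {n2o_limit:.6f} nmol m-2 s-1")  # prints 0.000496
```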
This previous work gives us high confidence in the overall quality of the soil fluxes of CO 2 , CH 4 , and N 2 O reported here.

Usage Notes
Recommended filtering criteria. The AmeriFlux-formatted data file included here has been filtered according to standard methods used for over two decades at Howland Forest, and is the data set recommended for most analyses. However, the Mauder and Foken 39 QC flags included in the EddyPro output could alternatively be used for data filtering (avoiding values flagged as "2").
At Howland, we have always adopted the friction velocity ("u* filtering") method of removing night-time data recorded under periods of high atmospheric stability and low turbulence 12 . We therefore recommend that data from nocturnal periods (PPFD ≤ 5 μmol m −2 s −1 ) be excluded when u* ≤ 0.25 m s −1 . These periods are indicated by a summary flag value of 4 in the QC and outlier flags file.
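For users who start from unfiltered fluxes, the u* criterion above could be applied along the following lines. This is a minimal sketch: the dictionary keys `flux`, `ppfd`, and `ustar` are hypothetical and do not correspond to column names in the data files, and leaving rows with missing PPFD or u* unchanged is an assumption made here rather than a rule from the text.

```python
USTAR_THRESHOLD = 0.25  # m s-1, from the text
PPFD_NIGHT_MAX = 5.0    # umol m-2 s-1, from the text


def is_low_turbulence_night(ppfd, ustar):
    """True for nocturnal half-hours with insufficient turbulence."""
    if ppfd is None or ustar is None:
        return False  # assumption: cannot classify, so do not reject
    return ppfd <= PPFD_NIGHT_MAX and ustar <= USTAR_THRESHOLD


def apply_ustar_filter(rows):
    """Return copies of rows with flux set to None where the filter applies.

    rows: list of dicts with (hypothetical) keys 'flux', 'ppfd', 'ustar'.
    The input rows are not modified.
    """
    filtered = []
    for row in rows:
        new_row = dict(row)
        if is_low_turbulence_night(row.get("ppfd"), row.get("ustar")):
            new_row["flux"] = None
        filtered.append(new_row)
    return filtered
```

The same result can be obtained directly from the distributed files by excluding records with a summary flag value of 4.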
Gap filling and flux partitioning. A variety of methods are commonly used to fill gaps in meteorological and flux data sets so that annual averages or integrals can be estimated 49 . However, following standard AmeriFlux protocols, the data here have not been gap-filled. For gap-filling of meteorological data sets, methods based on reanalysis products have been developed 50 and these may be preferred to empirical methods based on mean diurnal variation. For gap-filling of CO 2 , H, and LE fluxes, the online gap-filling tool provided by the Max Planck Institute can be used (https://www.bgc-jena.mpg.de/bgi/index.php/Services/REddyProcWeb). This tool can also partition net fluxes to their underlying component fluxes, e.g. net CO 2 flux is partitioned to ecosystem respiration and gross primary production 51,52 , which is valuable for ecosystem C budget analyses. A variety of methods (including temperature relationships, neural networks, linear interpolation, mean diurnal variation, etc.) have been used for gap-filling of CH 4 fluxes 53 , but we are not aware of a consensus method.
Complementary data sets. Long-term data from the Howland AmeriFlux site are available through the AmeriFlux data portal (https://ameriflux.lbl.gov/sites/site-search/#keyword=Howland). These include CO 2 , H 2 O, and energy fluxes measured via eddy covariance, as well as meteorological and environmental data at a 30-minute time step. Measurements have been conducted at the main Howland tower (AmeriFlux site US-Ho1) 30 since 1996 (the full US-Ho1 dataset available for download from AmeriFlux includes the filtered half-hourly AmeriFlux-format dataset described here); at the west Howland tower (AmeriFlux site US-Ho2) 54 , the site of a low-level N addition experiment, since 1998; and at the east Howland tower (AmeriFlux site US-Ho3) 55 , the site of a shelterwood harvest experiment, since 2001.
Additional publicly-available data sets for Howland include the following: