Soil nitrogen concentration mediates the relationship between leguminous trees and neighbor diversity in tropical forests

Legumes provide an essential service to ecosystems by capturing nitrogen from the atmosphere and delivering it to the soil, where it may then be available to other plants. However, this facilitation by legumes has not been widely studied in global tropical forests. Demographic data from 11 large forest plots (16–60 ha) ranging from 5.25° S to 29.25° N latitude show that within forests, leguminous trees have a larger effect on neighbor diversity than non-legumes. Where soil nitrogen is high, most legume species have higher neighbor diversity than non-legumes. Where soil nitrogen is low, most legumes have lower neighbor diversity than non-legumes. No facilitation effect on neighbor basal area was observed in either high or low soil N conditions. The legume–soil nitrogen positive feedback that promotes tree diversity has both theoretical implications for understanding species coexistence in diverse forests, and practical implications for the utilization of legumes in forest restoration.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted Give P values as exact values whenever suitable.

For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code Data collection We used R 3.4.2 software to analyze the data and draw the figures.

Data analysis
All data are analyzed by R 3.4.2 software.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A list of figures that have associated raw data -A description of any restrictions on data availability Data comes from 11 ForestGEO dynamic plots (https://forestgeo.si.edu/) across the whole world before we could use the data in this paper. All users can apply for data usage in that website.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Blinding
Describe whether the investigators were blinded to group allocation during data collection and/or analysis. If blinding was not possible, describe why OR explain why blinding was not relevant to your study.

Behavioural & social sciences study design
All studies must disclose on these points even when the disclosure is negative.

Study description
Briefly describe the study type including whether data are quantitative, qualitative, or mixed-methods (e.g. qualitative cross-sectional, quantitative experimental, mixed-methods case study).

Data collection
Provide details about the data collection procedure, including the instruments or devices used to record the data (e.g. pen and paper, computer, eye tracker, video or audio equipment) whether anyone was present besides the participant(s) and the researcher, and whether the researcher was blind to experimental condition and/or the study hypothesis during data collection.

Timing
Indicate the start and stop dates of data collection. If there is a gap between collection periods, state the dates for each sample cohort.

Data exclusions
If no data were excluded from the analyses, state so OR if data were excluded, provide the exact number of exclusions and the rationale behind them, indicating whether exclusion criteria were pre-established.

Non-participation
State how many participants dropped out/declined participation and the reason(s) given OR provide response rate OR state that no participants dropped out/declined participation.

Randomization
If participants were not allocated into experimental groups, state so OR describe how participants were allocated to groups, and if allocation was not random, describe how covariates were controlled.

Ecological, evolutionary & environmental sciences study design
All studies must disclose on these points even when the disclosure is negative.

Study description
We use spatial statistics method to study the neighbor effects of legumes and explore the factors related to neighbors effects in different forest plots in the tropics.

Research sample
Tree measurement data are obtained from 11 plots, with areas ranging from 16 to 60 ha.

Sampling strategy
All the plot censuses followed a standard enumeration and included all stems with diameter at the breast height (DBH diameter ≥ 1 cm at 1.3 m above the ground as required from the ForestGEO plot network, https://www.forestgeo.si.edu/). All stems were identified to species, measured and mapped to their relative spatial coordinates within each forest plot.

Data collection
We carried out tree censuses in 11 forest plots widely distributed from latitude 5.25°S to 29.25°N that included the principal tropical forest formations of the world. All co-authors contribute to the data collection. Soil nutrients data are obtained from Oak Ridge nature research | reporting summary
Timing and spatial scale One one time tree census data was used for each forest plot.

Data exclusions
No exclusions.

Reproducibility
The findings was verified by spatial statistics method and can be repeated by using the same 11 plot data and soil nutrient data.

Randomization
Randomization was applied for the species neighbor diversity calculation. It is described in detail in the method part.

Blinding
Data are collected from the field and have been used by scientists from different countries. It is not indoor experiment that needs blinding design.
Did the study involve field work?

Yes No
Field work, collection and transport Field conditions Eleven plots data come from different countries in the tropics, as listed in Table 1.

Location
Eleven plots data come from different countries in the tropics, as listed in Table 1.
Access and import/export All co-authors contribute to the data collection in different forest plots. Soil nutrients data are obtained from Oak Ridge National Laboratory Distributed Active Archive Center (https://daac.ornl.gov).

Disturbance
There is only natural disturbance in the plots and no human disturbance.

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Authentication
Describe the authentication procedures for each cell line used OR declare that none of the cell lines used were authenticated.

Mycoplasma contamination
Confirm that all cell lines tested negative for mycoplasma contamination OR describe the results of the testing for mycoplasma contamination OR declare that the cell lines were not tested for mycoplasma contamination.

Commonly misidentified lines (See ICLAC register)
Name any commonly misidentified cell lines used in the study and provide a rationale for their use.

Palaeontology Specimen provenance
Provide provenance information for specimens and describe permits that were obtained for the work (including the name of the 4 nature research | reporting summary

October 2018
Specimen provenance issuing authority, the date of issue, and any identifying information).

Specimen deposition
Indicate where the specimens have been deposited to permit free access by other researchers.

Dating methods
If new dates are provided, describe how they were obtained (e.g. collection, storage, sample pretreatment and measurement), where they were obtained (i.e. lab name), the calibration program and the protocol for quality assurance OR state that no new dates are provided.
Tick this box to confirm that the raw and calibrated dates are available in the paper or in Supplementary Information.

Animals and other organisms
Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research

Laboratory animals
For laboratory animals, report species, strain, sex and age OR state that the study did not involve laboratory animals.

Wild animals
Provide details on animals observed in or captured in the field; report species, sex and age where possible. Describe how animals were caught and transported and what happened to captive animals after the study (if killed, explain why and describe method; if released, say where and when) OR state that the study did not involve wild animals.

Field-collected samples
For laboratory work with field-collected samples, describe all relevant parameters such as housing, maintenance, temperature, photoperiod and end-of-experiment protocol OR state that the study did not involve samples collected from the field.

Ethics oversight
Identify the organization(s) that approved or provided guidance on the study protocol, OR state that no ethical approval or guidance was required and explain why not.
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Human research participants
Policy information about studies involving human research participants

Data quality
Describe the methods used to ensure data quality in full detail, including how many peaks are at FDR 5% and above 5-fold enrichment.

Software
Describe the software used to collect and analyze the ChIP-seq data. For custom code that has been deposited into a community repository, provide accession details.

Flow Cytometry
Plots Confirm that: The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.

Methodology Sample preparation
Describe the sample preparation, detailing the biological source of the cells and any tissue processing steps used.

Instrument
Identify the instrument used for data collection, specifying make and model number.

Software
Describe the software used to collect and analyze the flow cytometry data. For custom code that has been deposited into a community repository, provide accession details.
Cell population abundance Describe the abundance of the relevant cell populations within post-sort fractions, providing details on the purity of the samples and how it was determined.

Gating strategy
Describe the gating strategy used for all relevant experiments, specifying the preliminary FSC/SSC gates of the starting cell population, indicating where boundaries between "positive" and "negative" staining cell populations are defined.
Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information.

Magnetic resonance imaging
Experimental design Design type Indicate task or resting state; event-related or block design.

Specify the number of blocks, trials or experimental units per session and/or subject, and specify the length of each trial or block (if trials are blocked) and interval between trials.
Behavioral performance measures State number and/or type of variables recorded (e.g. correct button press, response time) and what statistics were used to establish that the subjects were performing the task as expected (e.g. mean, range, and/or standard deviation across subjects).