Dipterocarpoideae genomics reveal their demography and adaptations to Asian rainforests

Dipterocarpoideae species form the emergent layer of Asian rainforests. They are the indicator species for Asian rainforest distribution, but they are severely threatened. Here, to understand their adaptation and population decline, we assemble high-quality genomes of seven Dipterocarpoideae species, including two autotetraploid species. We estimate the divergence time between Dipterocarpoideae and Malvaceae and within Dipterocarpoideae to be 108.2 (97.8‒118.2) and 88.4 (77.7‒102.9) million years ago, respectively, and we identify a whole genome duplication event preceding dipterocarp lineage diversification. We find several genes that show a signature of selection, likely associated with adaptation to Asian rainforests. By resequencing two endangered species, we detect an expansion of effective population size after the last glacial period and a recent sharp decline coinciding with the history of local human activities. Our findings contribute to understanding the diversification and adaptation of dipterocarps and highlight anthropogenic disturbances as a major factor in their endangered status.




Ecological, evolutionary & environmental sciences study design
Most Dipterocarpoideae species are endangered, raising the question of how they established dominance but subsequently declined. In this study, we combined de novo assemblies of seven Dipterocarpoideae species with population genomic information from two seriously endangered Dipterocarpoideae species to reveal the molecular footprints associated with their adaptation to tropical environments and the factors causing their endangerment. We first generated DNA sequencing data (Illumina and PacBio sequencing data, which were quantitative data) and assembled the genomes of the studied Dipterocarpoideae species (only one tree of each species was used in genome assembly). We then conducted phylogenomic analysis using our assembled genomes and 12 genomes of Dipterocarpoideae species from another study, with six genomes of species from other families used as the outgroup. The analysis testing for genome duplication events was performed using our assembled genomes to detect potential historical events relevant to the diversification and the autotetraploidization of two Hopea species. To reveal the positively selected genes that are likely to be associated with the adaptation of Dipterocarpoideae, we conducted comparative genomic analysis, setting the genomes of five temperate tree species as the control and our assembled genomes as the treatment. We then focused on population genomic analyses by sequencing the genomes of different individuals of the two endangered species (30 and 32 wild individuals were sampled for these two species, i.e., the number of units/replicates was 30 and 32 for all the following analyses), and the sequencing data were quantitative. After revealing patterns of genetic variation at the genome level by SNP (single nucleotide polymorphism) calling, we obtained the SNPs across the genome for each sampled individual. For both species, we performed population demographic analysis to uncover the population dynamics across different historical stages; these results can help infer historical events likely contributing to the endangered status of these two species. Moreover, using SNP data, we identified derived deleterious mutations to assess the genetic load within each of the two species, detecting the derived mutations by comparison with the genome of a related species. The genetic structure and the coefficient of inbreeding at the genome level were analyzed using SNP data to evaluate the genetic consequences of the recent population decline.
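As an illustration of the genome-level coefficient of inbreeding mentioned above, the following is a minimal sketch of one common estimator, F = 1 - H_obs / H_exp, computed per individual from biallelic SNP genotypes. It is not necessarily the estimator or software used in this study; the function name, the genotype encoding (0, 1, 2 copies of the alternate allele), and the toy data are assumptions made for the example, and missing genotypes are not handled.

```python
from typing import List


def inbreeding_coefficients(genotypes: List[List[int]]) -> List[float]:
    """Per-individual F = 1 - H_obs / H_exp.

    genotypes[i][j] is the number of alternate alleles (0, 1 or 2)
    carried by individual i at biallelic SNP j (hypothetical encoding).
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])

    # Alternate-allele frequency at each SNP, estimated from the sample.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n_ind) for j in range(n_snp)]

    # Expected number of heterozygous sites per individual under
    # Hardy-Weinberg proportions: sum of 2pq over SNPs.
    exp_het = sum(2 * p * (1 - p) for p in freqs)
    if exp_het == 0:
        raise ValueError("no polymorphic SNPs; F is undefined")

    result = []
    for ind in genotypes:
        obs_het = sum(1 for g in ind if g == 1)  # observed heterozygous sites
        result.append(1 - obs_het / exp_het)
    return result


# Toy data (invented): two fully homozygous individuals and one
# individual that is heterozygous at every site.
toy = [
    [0, 2, 0, 2],
    [1, 1, 1, 1],
    [2, 0, 2, 0],
]
f_values = inbreeding_coefficients(toy)
```

Positive values indicate fewer heterozygous sites than expected under random mating, which is the pattern a recent population decline with inbreeding would tend to produce.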

Location
We selected seven species from five major genera of Dipterocarpoideae to typify this subfamily, of which the two species used in the population genomic study are typical endangered species listed by the IUCN. We collected fresh leaves from one morphologically well-identified tree of each selected species to achieve high-quality de novo genome assembly. To perform evolutionary and comparative genomic analysis, we included many published plant genomes (see Supplementary Table 8 for details of species and data sources). For the population genomic study, we focused on the remnant populations of Hopea hainanensis and Hopea reticulata on Hainan Island, China (records of population locations are only available for Hainan Island, and thus the distribution of these two species is highly likely to be restricted to this island). Fresh leaves were sampled from 30 and 32 wild trees of the two species, respectively, to meet the sample size requirement of each statistical analysis. These sampled trees were located in all areas where the two species have been recorded (see Supplementary Fig. 12a and Supplementary Table 18 for details of sampling locations), and thus our samples can be considered representative of the two species. Moreover, for the in vitro functional validation of positively selected genes, we carried out three replicates to confirm the enzyme activity of each candidate gene.
Based on the morphological records and specimens in Xishuangbanna Tropical Botanical Garden and the Research Institute of Tropical Forestry (Chinese Academy of Forestry), we selected one morphologically well-identified tree of each selected species for genome assembly. Then, with the guidance of staff at local forestry stations, we conducted field surveys throughout the known distributional ranges of H. hainanensis and H. reticulata on Hainan Island and located each observed wild tree of these two species. About 100 wild trees of H. hainanensis were found in four areas (sampling sites), and c. 200 wild trees of H. reticulata were found, all within a single area (see Supplementary Table 18 for details). Given the small population sizes and limited distribution ranges of these two species, we considered each species to comprise a single population. To ensure genetic independence between samples while meeting the minimum sample size required by population genomic studies (generally considered to be 30 sampled individuals per population), we collected leaf samples from trees separated by at least 100 m. Our final sample sizes were 30 and 32 for H. hainanensis and H. reticulata, respectively, which met the required standard for statistical analyses in population genomics.
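The 100 m minimum spacing rule can be sketched as a simple greedy filter over surveyed tree coordinates. This is only an illustration under stated assumptions: the source does not describe how spacing was enforced in the field, the function names are hypothetical, and the coordinates below are invented rather than taken from Supplementary Table 18.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_m(a: Coord, b: Coord) -> float:
    """Great-circle distance in metres between two points."""
    earth_radius_m = 6_371_000.0
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(h))


def select_spaced(trees: List[Coord], min_dist_m: float = 100.0) -> List[Coord]:
    """Keep a tree only if it is at least min_dist_m from every tree already kept."""
    chosen: List[Coord] = []
    for tree in trees:
        if all(haversine_m(tree, kept) >= min_dist_m for kept in chosen):
            chosen.append(tree)
    return chosen


# Invented coordinates: the second tree is roughly 50 m from the first,
# the third roughly 200 m from the first.
surveyed = [(19.0, 109.0), (19.0, 109.0005), (19.0, 109.002)]
sampled = select_spaced(surveyed)
```

A greedy pass depends on the order in which trees are visited, so it does not necessarily return the largest possible set of well-spaced trees; it only guarantees that every retained pair satisfies the spacing threshold.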
Sample collection for de novo genome assembly and the population genomic study was conducted by Chao-Nan Liu, Xing-Hua Hu, Simon Segar, Shan Chen, Rong Wang, Yuan-Ye Zhang, Xiao-Yong Chen, Yuan-Yuan Ding, Yuan-Yuan Li, Gang Wang, Lu-Fan Chen, Stephen G Compton, Fang K Du, Run-Guo Zang, Dong-Hai Li, Ling Lu, Liang Tang, and Yang Yang. Experiments for the in vitro validation of enzyme activity were conducted by Kai Jiang, Jun-Yin Deng, Yu-Ting Jiang, Xin Tong and Rong Wang. Sequencing was carried out using Illumina and PacBio platforms, and Kai-Jian Zhang, Chao-Nan Liu, Rong Wang, Xiao-Di Hu, Ling Kang, Wei-Wei Xu, and Zhuo-Xin Zu recorded the data.
Data collection in this study started in June 2018 and finished in May 2023. From June to December 2018, we selected the trees for de novo genome assembly, conducted the field survey to identify the wild trees for the population genomic study, and collected leaves from these trees. From November 2018 to December 2020, the genomes and transcriptomes of the seven Dipterocarpoideae species were sequenced and assembled. From June to July 2022, we performed in vitro validation of enzyme activity for the positively selected genes associated with the species' adaptation to tropical environments. To ensure the representativeness of our sampled trees, samples for the population genomic study covered the entire distribution ranges of the species (c. 1000 km² and 50 km² for H. hainanensis and H. reticulata, respectively) reported by previous studies (the sampling locations are shown in Fig. 4d). From February to May 2023, we collected leaf tissues from the trees used for de novo genome assembly of our target species to conduct flow cytometry experiments estimating their genome size and ploidy.
No data were excluded.
We sampled one individual tree of each species for de novo genome assembly, and 30 and 32 wild trees of H. hainanensis and H. reticulata, respectively, for the population genomic study. For de novo genome assembly, the generally accepted method is to use genomic DNA from only one individual to ensure the quality of the assembled genome. We obtained a high depth of both Illumina and PacBio sequencing data (see Supplementary Table 2) to assemble high-quality reference genomes that can be reproduced by independent studies if necessary. In addition to sufficient biological replicates (i.e., our sample size) for the population genomic study, we sequenced the genome of each sample at a very high sequencing depth (> 40X; see Supplementary Table 18) to ensure that most regions of each sample's genome were sequenced multiple times. Thus, incorrect genomic sequences were unlikely to be present in our study, and the results of sequence mapping are expected to be reproducible. For the in vitro functional validation of positively selected genes, all three replicates of each candidate gene showed enzyme activity. Overall, all attempts to repeat experiments were successful.
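The reasoning that a depth above 40X leaves few regions unsequenced can be made concrete with a back-of-the-envelope calculation. The sketch below assumes reads are placed uniformly at random (the idealised Lander-Waterman model), which real sequencing data only approximate; the function names and the read count, read length, and genome size are hypothetical values chosen for the example, not figures from this study.

```python
import math


def mean_depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def expected_uncovered_fraction(depth: float) -> float:
    """Expected fraction of bases with zero coverage under a Poisson model."""
    return math.exp(-depth)


# Hypothetical run: 120 million reads of 150 bp on a 400 Mb genome.
depth = mean_depth(120_000_000, 150, 400_000_000)
uncovered = expected_uncovered_fraction(40.0)
```

Under this idealised model the expected uncovered fraction at 40X is vanishingly small, so any gaps in practice are more likely to come from repeats, mapping ambiguity, or sequence-composition bias than from insufficient depth.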
In this study, we chose the plants for population genomic study haphazardly.
Different groups of researchers and postgraduates conducted the experiments without being informed of the results obtained from other experiments in this study. Analyses were carried out using open-source programmes, and therefore the algorithms were not specifically designed for our data.
The area for field work was mainly covered by tropical lowland rainforests and was dominated by a monsoon climate, with an average temperature of c. 24 degrees Celsius and a mean precipitation of 1600 mm. As dominant tree species, the studied Dipterocarpoideae species mainly grow in valleys and on mountain sides covered by tropical forests. During the field work period, we collected fresh leaves from sampled trees for genome assembly.

nature portfolio | reporting summary
April 2023

Access & import/export
The field work involved in our study was permitted and assisted by local forestry stations.

Disturbance
No serious disturbance was caused in this study because we only sampled fresh leaves from trees, without destroying any of the sampled plants and their crops.
