Genetic and chemodiversity in native populations of Schinus terebinthifolia Raddi along the Brazilian Atlantic forest

Schinus terebinthifolia is a species native to different ecoregions in the Brazilian Atlantic Forest. The plant is listed on the National Relation of Medicinal Plants and recommended as phytomedicine, however while extractive exploitation prevails as the main route of raw material a significant variation of compounds will be detected. To assure the expansion of productive chain it is important to start by studying population diversity and chemical variations. We used SSR markers for studies of genetic structure among populations from dense ombrophilous forest (ES); the deciduous seasonal forest (SM); the savanna (DOU) and the sandbanks (ITA and MSP), and compared the results to their chemical profiles of essential oil. Genetic structure revealed differences among populations and significant fixation rates. Pairwise studies and Bayesian analysis showed similarities between ITA and SM and between DOU and MSP, proving that the patterns of distribution for the species do not follow the isolation by distance or similarity by environmental conditions. The comparison between PCA of genotypes and chemodiversity reinforces the unique profile for each population despite the environmental similarity observed and genetic analysis. The most divergent genotype and chemical group was found at the ombrophilous forest, strong evidence that we should undertake conservation efforts to prevent losses of biodiversity in that area.

Therefore, a series of studies encouraging the propagation of Brazilian peppertree by family farming and its use for reforesting degraded areas increased substantially. Some authors, considering the idea of the occurrence of chemotypes, defend the necessity of preventing extractive exploitation and the management of units for conservation of the species 17,18 . However, the Brazilian peppertree has a short generation time, is a highly prolific seed producer, and the seeds are readily consumed and spread by birds 2,19,20 , it is imperative to take into account the global warning about the difficult control of Brazilian peppertree dispersion and the risks it implies to the environment.
Studies of genetic diversity and spatial structure were carried out with distinct populations of Brazilian peppertree 5,[18][19][20][21] . Nevertheless, there are still many gaps related to the distribution and dispersion of the Brazilian peppertree, mainly because the sample groups in previous works have always been restricted to a few native ecogeographic regions. Essential oil composition may be affected by a variety of factors including soil properties, solar radiation, temperature, altitude, humidity and ontogenetic stage, and physiological stages of fruit maturation [22][23][24] . The sum of all these factors can potentially affect the bioactivity of a final product. Studies comparing the genetic diversity and chemodiversity of the species across different environmental conditions for growth is crucial for the development of elite lines for traditional medicine and the pharmaceutical industry.
The essential oil of the Brazilian peppertree is already used in the formula of many cosmetics and pharmaceutical products. Although considering the heterogeneity of its composition observed on pharmacological studies, we could not find in the literature studies in comparison of chemodiversity of the species considering environmental conditions of growth. Nevertheless, major constituents often observed are α-pinene, limonene, sabinene, β-pinene, α-copaene, α-phellandrene, germacrene and cis-salvene 8,[25][26][27][28] , with a great variation among their concentration.
In this study, we compare genetic diversity and chemodiversity among five populations representing distinct Brazilian ecosystems: decidual seasonal forest from Santa Maria-Rio Grande do Sul (Cfa); savanna from Dourados-Mato Grosso do Sul (Cwa); dense ombrophilous forest from São Mateus-Espírito Santo (Aw); sandbanks from Itaparica-Bahia (Af) and from Morro de São Paulo-Bahia (Af). We ask if differences in chemodiversity may be related to ecoregion or genetic differences among these populations.

Materials and methods
Plant material and DNA extraction. To  After collection, the plant material was sterilized in a 3% NaClO solution for 10 min, then rinsed, dried and stored in liquid nitrogen until DNA extraction. Total genomic DNA was extracted from leaf tissue using the CTAB method 29 . Microsatellites. We used a panel of seven microsatellite loci adapted from Williams et al. 30 (StAAT1, StAAT9, StAAT16, StAAT17, StAAT25, StAAT47, StAAT55) for polymorphisms. PCR were performed using a multiplex system as described in Table 2. PCR final volume (25 μl) contained: 4.5 µl of ultrapure water, 12.5 µl of PCR master mix, 1.0 µl of each primers, 2U of Taq DNA Polymerase and 2.3 µl of DNA (25-30 ng). The thermocycling program consisted of an   www.nature.com/scientificreports/ PCR product was applied and the electrophoresis run lasted for approximately 2 h, with constant 70 W. A 10 bp ladder molecular weight marker was loaded on the side edge of each gel to assign a fragment size to each band. The gels were stained with silver nitrate.
Data analysis. Parameters of loci diversity were estimated for all microsatellites, in all populations, using the CERVUS 3.0 program 31 . The parameters were as follows: allele frequency was estimated by direct counting, expected (He) and observed heterozygosity (Ho), polymorphic information content (PIC) and Hardy-Weinberg equilibrium (HWE). The diveRsity R package (R Development Core Team 2020) 32 was used to calculate allelic richness (AR) and estimates of Wright's F statistics (F IT , F IS and F ST ). Confidence intervals were obtained with 10,000 bootstrap replicates. The influence of null alleles was determined in FREENA 33 by computing the genetic divergence parameter (F ST ) values using an excluding null alleles (ENA) correction. After accounting for null allele frequencies, loci with frequencies of ≥ 0.2 were considered potentially problematic for the calculations. The P-value was adjusted using the Bonferroni procedure with the same statistical package. Population structure was evaluated by analysis of molecular variance (AMOVA) using the ARLEQUIN program version 3.5.2.2 34 to reveal the diversity within localities and among localities. The dendrogram was constructed by cluster analysis using the UPGMA method, based on calculations of Nei's genetic distances and 1000 bootstrap resamplings using poppr R package 35 .
Based on the allelic frequency of the seven microsatellite loci, the individuals were grouped in a given number of populations and probabilistically marked into groups inferred by Bayesian analysis using the Structure program 36 . The tests were performed using an admixture model where the allelic frequencies were correlated. To select the appropriate number of inferred populations, several analyzes were conducted with k (number of populations inferred) ranging from 2 to 6, with 30,000 interactions (burn-in period of 300), with three independent replications for each analysis. The real values of K were inferred from the magnitude of ΔK and given as a function of K, with the aid of the Structure Harvester program 37 following the model proposed by Evanno et al. 38 To corroborate the outputs from the Bayesian analysis, we also analyzed the datasets via a discriminant analysis of principal components (DAPC) 39 using the ADEGENET package 40 . The number of clusters was assessed using the function find.clusters, which runs successive rounds of K-means clustering with an increasing number of clusters (K). To select the optimal number of clusters, we chose the lowest associated Bayesian information criterion (BIC) value.
Chemical analysis of essential oil. We used a Clevenger-type apparatus for the extraction of the essential oil from ripe berries of Brazilian peppertree. A pool of fruits from several individuals representative from each ecogeographic region was weight (300 g) and submitted to 3 h of hydrodistillation. The Fig. 4 shows the moment and quality of fruits harvested (images with participant consent).
The essential oil was stored in sterile microtubules at − 20 °C until further analysis. The samples were prepared in hexane at the concentration 100 µg mL −1 . The gas chromatograph used was a GC-2010 Plus (Shimadzu, Kyoto, Japan) coupled to a mass spectrometer (GC-MS 2010 Ultra) using a DB-5 column (J and W, Folsom, California, USA) coated with 5% phenyl dimethylpolysiloxane on capillary fused silica (30 m long × 0.25 mm internal diameter × 0.25 μm film thickness). The conditions of analysis were as follows: injection volume 1 µL in split 1:20 mode; heating ramp with initial temperature of 50 °C, reaching 280 °C at a rate of 3 °C min −1 and remaining at the final temperature for 10 min; and injector temperature of 280 °C. The temperatures of detector and transfer line were 290 °C. The parameters of mass spectrometry included scanning MS voltage electron impact ionization of 70 eV with m/z 40-600 and scanning range of 0.3 s. Compound identifications were performed using the calculated retention index (RI) and the linear alkane (C 7 -C 40 , Sigma Aldrich purity ≥ 90%) standard, along with comparisons of the RI with indexes found in the literature 41 and also used our interpretation of mass spectra obtained from the samples and compared with the databases (NIST21 and WILEY229). The peak area of each compound was determined by manual integration of each total ion chromatogram (TIC). Then all areas were transformed into relative percentage areas (relative abundance percentual).
Ethical approval. The access to genetic resources were assured by SISGEN license nº A9CDAAE.

Results
Population structure analysis revealed differentiation among them of 37.72% (Table 3) and significant fixation index based on F ST (P < 0.001).
The parameters evaluated for genotypic characterization of the populations based on seven microsatellite loci are described in Table 4. The population ES (São Mateus-ES) had the higher allelic richness (4.98), with the inbreeding coefficient being significantly different from zero (0.16), the greatest expected heterozygosity was shown by the population of SM (Santa Maria-RS) (0.53). The DOU population (Dourados-MS) presented the  StAAT16, StAAT17, StAAT55). After null allele correction (ENA), overall F ST changed only slightly (from 0.377 to 0.369). Pairwise F ST values were highest for DOU, followed by MSP while FST was the lowest between ITA and SM (Table 5).
UPGMA analysis also confirmed greater similarity between the SM and ITA populations, and revealed considerable distance between ITA and MSP, despite them being geographically close together in the same type of ecoregion (Fig. 2).
The STRU CTU RE analysis and the Evanno method indicated that K = 3 was the most likely number of populations and grouped SM and ITA together and MSP and DOU together. In contrast, the DAPC analysis grouped individuals into their five respective populations with a few individuals in the SM population assigned to ITA (Fig. 3).
CG analysis detected 38 compounds, four of which were at relatively high concentration and found in all five populations including α-pinene, sabinene, β-pinene, limonene. A number of compounds were not observed in all five populations including ζ-salvene (only in MSP, SM and DOU), β-ocimene, evadone (only in ITA and MSP), aromadendrene, d-germacrene, d-davanone (only in MSP), carotol (only in ITA and SM) and eremoligenol (only in ES) ( Table 6).
Principal Components Analysis (PCA) plots were performed using genotypes and chemical compound data. Clusters found for genotypes reinforced the patterns of previous genetic structure analysis. ES individuals group in a separate cluster, while ITA and SM individuals cluster together. MSP and DOU individuals form two clusters that are close to each other and separate from the other populations. However, when the populations were grouped by their chemodiversity, the PCA revealed that each population has a unique chemical profile (Fig. 4).

Discussion
As expected for an outcrossing species, the inbreeding coefficients (F IS ) were low and genetic diversity was high in most of the sampled populations with the exception of DOU (Cerrado-Brazilian Savanna). The excess of heterozygotes and low allelic richness suggest this population has experienced a recent bottleneck 45,46 . This population is located in an area recently classified as an 'Ecological Transition Zone' due to anthropic pressures (deforestation, fire regime, settlement, etc.) which may be responsible for the bottleneck. Similar bottlenecks in Brazilian peppertree populations have been detected in other fragmented areas in Brazil and highlight the negative effect that anthropogenic factors can have on this species in its native range. The ES population was distinct in all clustering analyses and had the highest genetic diversity compared to the other sites. The ES population is located in an important conservation area (dense ombrophilous forest) which may have protected this population  This assumption can also be supported by the values of an observed heterozygosity higher than the expected (He 0.58 and 0.46, respectively). When a population experiences a reduction of its effective size, it generally develops a heterozygosity excess which may persist only for a certain number of generations until a new equilibrium is established 42 . Moreover, DOU showed the lowest rates for allelic richness (2.43). Reinforcing the hypothesis of bottleneck since the loss of the average number of alleles per locus is also related to genetic drift typical in populations that have their effective size dramatically reduced 43 .
To understand the genetic diversity of Brazilian peppertree it's important to consider the high ecoplasticity and invasive behavior of the species. The populations analyzed in this work are originally from five different ecogeographic regions with particular characteristics (Table 1). Nevertheless, all five populations showed high genetic diversity (Table 4), especially when taking into account the rates from previous studies 21, 44 , reinforced by a high differentiation index among populations (37.72%) and it proves that neither the time nor human pressure were enough to disrupt their distribution patterns-typical in invasive species. Possibly due to the high spatial density and efficient mechanisms for seed dispersal which allowed an intense gene exchange and consequent genotypic recombination, responsible to increase its evolutionary potential and adaptability.
Structure analysis (Fig. 3) showed us three clusters based on genetic similarity (SM + ITA; DOU + MSP and ES). Interestingly the clusters have no correlation with the distances where samples were collected, confirming what Ruas et al. 44 had inferred about the species distribution patterns do not fit within the isolation by distance patterns. Pinto et al. 5 have analyzed haplotypes distribution of Brazilian peppertree considering different biomes, and the authors suggested a cluster of 3 haplotypes based on the intravarietal polymorphisms. Nevertheless, in our study, population from ES showed the highest distance instead of DOU, while SM and ITA corroborate by clustering together. ES also presented the higher allelic richness (4.98), coincidentally this population occurs in the ecogeographic region of greater preservation compared to the others (dense ombrophilous forest).
The patterns of genetic variation across a species range can provide information about the processes of distribution and help us to understand how ecology, evolution and geography intersect 45 . We believe the similarities found between populations from SM and ITA may be correlated to the same colonization event and a reasonable explanation for divergence of the other three genotypes may be due to environmental barriers-DOU occurs in a fragmental zone; MSP in an isolated island and ES surrounded by the dense forest. Thus, an isolation by environment (IBE) could be expected under the action of natural selection and/or environmentally influenced mating or migration 46,47 . Furthermore, Williams et al. 21 described the existence of well-structured populations of Brazilian peppertree in their native habitat. This phenomenon becomes evident in pairwise study when greater differentiation rates are obtained by DOU and MSP, respectively, two populations established in a narrow area and environmentally isolated 48,49 .
The dendrogram based on Nei distance (Fig. 2) confirms the occurrence of clusters, but if we consider the values on pairwise estimation (Table 5)   www.nature.com/scientificreports/ chemical profile. Secondary metabolites are mostly related to environmental factors including soil properties, solar radiation, temperature, altitude, humidity and harvest time 22,23 , it would be unexpected some clusters formation. Consistent with this expectation, our sampled populations, with the exception of ITA and MSP, were in different ecosystems and had unique chemical profiles. ITA and MSP are located geographically close together in the same pedo-climatic conditions and yet they had the greatest divergences for chemical composition between the populations. This suggests that ecosystem designations are not capturing all of the relevant variables that could result in chemical profile differences. For instance, differences in predators, disease, or pollinators could also influence secondary metabolite synthesis. Future studies will need to take into consideration a wider variety of factors to determine which are important for differences in chemodiversity of populations. From a pharmacological perspective, it is important to consider every factor that might influence the bioactivity of a phytomedicine. Understanding how environmental factors affect the chemodiversity of Brazilian peppertree will make it possible to develop elite lines for industry. As more studies are conducted across a wide variety of systems, we may be able to predict how different factors will influence patterns of chemodiversity. www.nature.com/scientificreports/

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.