The influence of anthropogenic habitat fragmentation on the genetic structure and diversity of the malaria vector Anopheles cruzii (Diptera: Culicidae)

Fragmentation of natural environments as a result of human interference has been associated with a decrease in species richness and increase in abundance of a few species that have adapted to these environments. The Brazilian Atlantic Forest, which has been undergoing an intense process of fragmentation and deforestation caused by human-made changes to the environment, is an important hotspot for malaria transmission. The main vector of simian and human malaria in this biome is the mosquito Anopheles cruzii. Anthropogenic processes reduce the availability of natural resources at the tree canopies, An. cruzii primary habitat. As a consequence, An. cruzii moves to the border of the Atlantic Forest nearing urban areas seeking resources, increasing their contact with humans in the process. We hypothesized that different levels of anthropogenic changes to the environment can be an important factor in driving the genetic structure and diversity in An. cruzii populations. Five different hypotheses using a cross-sectional and a longitudinal design were tested to assess genetic structure in sympatric An. cruzii populations and microevolutionary processes driving these populations. Single nucleotide polymorphisms were used to assess microgeographic genetic structure in An. cruzii populations in a low-endemicity area in the city of São Paulo, Brazil. Our results show an overall weak genetic structure among the populations, indicating a high gene flow system. However, our results also pointed to the presence of significant genetic structure between sympatric An. cruzii populations collected at ground and tree-canopy habitats in the urban environment and higher genetic variation in the ground-level population. This indicates that anthropogenic modifications leading to habitat fragmentation and a higher genetic diversity and structure in ground-level populations could be driving the behavior of An. cruzii, ultimately increasing its contact with humans. Understanding how anthropogenic changes in natural areas affect An. cruzii is essential for the development of more effective mosquito control strategies and, on a broader scale, for malaria-elimination efforts in the Brazilian Atlantic Forest.


Hypothesis 1 and Hypothesis 2
Genetic structure. Global statistics performed to measure population structure (D 57 , F ST and G'' ST 58 ) and the inbreeding coefficient (F IS ) between populations, yielded low values that were not statistically significant for the tests of hypothesis 1, indicating that there is no genetic structure between pooled canopy and ground-level populations from the Natural, Suburban/Rural and Urban environments ( Table 3). All population structure estimates for the tests of hypothesis 2 for populations from Natural and Suburban/Rural environments also yielded low values that were not statistically significant (Table 3). In contrast, estimates for the Urban population revealed low and statistically significant values for D, F ST and F IS , except for the estimator G" ST , which showed moderate genetic structure, albeit with a non-significant P value (Table 3). Pairwise estimates of F ST , G" ST and D among all populations in hypothesis 2 revealed significant estimates for the same pairs of populations, Natural ground x Suburban/Rural ground, Natural ground x Urban ground, and Urban canopy x Urban ground (S3  www.nature.com/scientificreports/ Table). Suggesting that the ground-level population in all areas is more structured than the tree-canopy populations. However, after the correction for multiple tests, all P values for all tests became non-significant. Analysis of molecular variance (AMOVA) carried out to detect population differentiation, showed that 99.83% and 99.24% of the variance for the tests of hypothesis 1 and 2, respectively, were estimated to be within populations, with non-significant P values (S4 Table). These results indicate that there is no population structure between the populations from the Natural, and Suburban/Rural environment. For the Urban environment, however, the results revealed that there may be genetic structure at a low hierarchical level.
Multivariate statistics. Principal component analysis (PCA) performed to assess genetic diversity among sampled populations revealed a high degree of overlapping in the test to verify hypothesis 1. The data variance explained by PC1 and PC2 of PCA was 2.4% and 2%, respectively (Fig. 3a). Overall, the population from the canopy habitat was more diverse and had more outliers. In the test of hypothesis 2, the data variances explained by PC1 and PC2 for the Natural population were 3.8% and 3.4%, respectively (Fig. 3b). There is a high degree of overlapping in the plot, and the mosquitoes collected at the canopy are genetically more diverse than the ones  www.nature.com/scientificreports/ collected at ground-level. The data variances explained by PC1 and PC2 for the Suburban/Rural population were 3.8% and 3.5%, respectively (Fig. 3b). The plot shows overlapping, indicating similarity, and that both populations were highly dispersed. For the Urban population, the data variances explained by PC1 and PC2 were 5.1% and 4.9%, respectively (Fig. 3b). In the urban environment plot the ground-level population is more diverse than the canopy population, contrasting with the trend seen in the Natural and Suburban/Rural environments.
Bayesian cluster analysis. The results of the Bayesian cluster analysis performed to infer population structure and subsequent application of the Evanno method, which identifies genetically homogeneous groups of individuals, revealed low structure in the An. cruzii populations studied (Fig. 4). For hypothesis 1, the most likely number of genetic groups was K = 2 according to the ΔK estimator. The STRU CTU RE plot revealed two genetic groups across the two populations without a significant genetic structure between them (Fig. 4a). For hypothesis 2, the most likely number of groups for the Natural population was K = 2 according to the ΔK estimator. The STRU CTU RE plot showed genetic groups across the two populations without a clear genetic structure between them. The most likely number of groups for the Suburban/rural population was K = 3 according to the ΔK estimator. The STRU CTU RE plot revealed three genetic groups well distributed between the two populations, and the canopy population showed slightly more variation in group membership than the ground-level population. The most likely number of groups for the Urban population was K = 3 according to the ΔK estimator. The STRU CTU RE plot revealed three genetic groups which varied in membership between sympatric canopy and ground-level populations. While the canopy population is less structured with a predominance of the color green, the groundlevel population appears to be more structured with an equal prevalence of the three genetic groups (Fig. 4b).

Hypothesis 3, Hypothesis 4 and Hypothesis 5
The tests for hypotheses 3, 4, and 5 did not reveal significant genetic structuring between populations, indicating that the three hypotheses should be rejected. Although genetic estimators (D, F ST , and G" ST ) and inbreeding coefficient (F IS ) showed low estimates for the tests of hypothesis 3 with significant P value, after correction for multiple tests all P values became non-significant. Furthermore, the lack of structuring was corroborated by PCA and Bayesian analysis, which showed no signs of cross-sectional structuring between the populations from the Natural, Suburban/Rural and Urban environments (S1 Table; Table; S2 Fig). AMOVA showed that more than 99% of the variance was estimated to be within populations in the tests for all hypotheses with non-significant P values (S4 Table).

Discussion
Deforestation and fragmentation of forests can have a negative impact on the abundance and survival of An. cruzii, a primary vector of human malaria in the Atlantic Forest, as this species is closely associated with forested areas 59 . However, because of its extremely anthropophilic nature, anthropogenic modifications can increase contact between this species and humans. We tested five hypotheses using a cross-sectional and longitudinal design to assess genetic structuring in sympatric An. cruzii populations and cross-sectional genetic structuring and microevolution in populations of this species from environments with different degrees of anthropogenic modifications.
Our findings suggest that An. cruzii has weaker fine-scale genetic structure than other Anopheles vectors in Brazil 53,55 . Nevertheless, the tests of hypothesis 2, in which An. cruzii populations collected in tree canopies and at ground level habitats in each environment were compared, yielded interesting results. PCA showed a different pattern of diversity in the Natural and Suburban/Rural environments, where the canopy habitat populations were more diverse, while in the Urban environment the ground-level habitat population appeared to be more diverse. To our knowledge, this is the first study in which An. cruzii populations collected in tree canopies and at ground level have been compared genetically. Our results show a clear structuring pattern between tree-canopy and ground-level populations from the Urban environment, and this was corroborated by all the analyses carried out. PCA and Bayesian cluster analyses showed that the urban ground-level population is more diverse and structured than the urban tree-canopy population.
While SNPs are reliable molecular markers for analyzing structure in mosquito populations on a microgeographic scale 53,60 , the SNP analyses showed higher expected heterozygosity than observed heterozygosity in all the hypotheses tested, indicating low levels of heterozygosis among the populations. Moreover, there was no evidence of cross-sectional structure between populations from the Natural, Suburban/Rural or Urban environments according to the PCA, Bayesian analysis and AMOVA, indicating that the differences between populations are very subtle, and only when closely related populations were compared were signs of local-level structuring observed. Although the genetic structure statistics (D, F ST and G" ST ) and the inbreeding coefficient (F IS ) yielded statistically significant values for the test of hypothesis 3, all the estimated values were quite low indicating low genetic structure. Furthermore, the significant F IS values could indicate that the heterozygosity in An. cruzii populations could be reduced by inbreeding of individuals. However, after correction for multiple tests, all P values became non-significant. It is important to consider that this study is investigating subtle variations in sympatric An. cruzii populations in both microgeographic and fine-temporal scales, which in turn can impact the significance of P values for multiple tests as it does not consider the power of the tests 61 www.nature.com/scientificreports/ www.nature.com/scientificreports/ The specimens analyzed here were collected in three different environments and were considered three different populations. However, as there are no geographic barriers, such as mountains, bodies of water, or a large canyon between the environments, our results indicated that the specimens represent a large, well-distributed population with a high overall genetic homogeneity. Previous studies carried out with mosquitoes from Brazil on a micro-geographic scale, suggest that native species exhibited weaker genetic structure while invasive species were shown to be more genetically structured and diverse 10,11,63,64 .
In the present study, An. cruzii populations showed no signs of genetic structure and variation when analyzed over time although microevolutionary changes, reflected as a variation in wing-shape over time in this species in the same areas, appear to vary greatly in urban environments compared with sylvatic and peri-urban environments 33 . Molecular and phenotypic markers are often contradictory, and in a previous study by our group geometric morphometrics showed clear structuring over time in An. cruzii but genetic analysis did not indicate significant structuring in the same populations 33 . Studies comparing the results of analyses using wing geometry and microsatellite markers in mosquitoes have shown that wing patterns change faster than genetic patterns 65 . While shape in mosquitoes is determined by multiple genes and their expression, SNPs can be filtered to ensure that they are independent, neutral and bi-allelic markers. Hence, differences in outcomes are to be expected for these two approaches 66,67 .
The loss of genetic diversity in native populations may be associated with anthropogenic modifications in the environment (e.g., deforestation, agriculture and urbanization) 1,53,68 . In addition, there is growing evidence that animal and plant populations experience divergent selection in urban and nonurban environments, which can explain the increased levels of genetic diversity in the population from ground-level habitat in the urban environment in the present study 1,5 . While there is no evidence of the presence of An. cruzii in highly urbanized metropolitan areas 69 , this species is constantly found in human dwellings close to forest fragments and in ornamental bromeliads in the anthropic environment 32,33,39,70 . Although An. cruzii is a bromeliad-specialist and its abundance is clearly associated with humidity, forest cover and bromeliad availability, studies have shown that anthropogenic modifications in natural areas can drive vertical dispersal in this mosquito and a variation in its phenotype and genotype, which in turn may drive the dynamics of Plasmodium transmission involving this important vector 33,37 .
A previous study showed that although anthropogenic changes in the environment can lead to a reduction in the abundance of An. cruzii, this species is found more frequently at ground level habitat in environments with increased deforestation due to human use 37 , indicating that the acrodendrophily of An. cruzii can be affected by human modifications in natural areas and areas close to dwellings 37,71 . These ecological findings corroborate the findings of our genetic study, suggesting that the presence of An. cruzii at ground level could be associated with proximity to anthropogenically modified environments. This varying acrodendrophilic behavior of An. cruzii is known to have important epidemiological implications for the dynamics of malaria transmission 28,34 .
Our findings suggest that the changes promoted by human activities play an essential role in the vertical dispersal of An. cruzii. Anthropogenic fragmentation of the Atlantic Forest biome leads to a reduction in natural hosts for this mosquito to blood feed on, which in tree canopies would be mainly birds and nonhuman primates 24,71,72 . This reduction may be driving An. cruzii to look for blood sources at ground level in places with human dwellings close to forest patches, which corroborates the anthropophilic behavior of An. cruzii and provides further evidence for the transmission of human and simian Plasmodium to humans living in the vicinity of forested areas 28,37,71 .
Furthermore, previous studies have shown that in some areas An. cruzii is more present inhabiting tree canopies, while in other areas it can be found in similar numbers in tree canopies and at ground level 37,73 . In a study conducted at the Horto Florestal da Cantareira, São Paulo, 99% of An. cruzii specimens collected were found in the tree canopies and despite the high malaria prevalence among the simian population there were no human cases of malaria 73 , whereas in the state of Santa Catarina, in the south of Brazil, a region where simian and human malaria have been reported, An. cruzii was found abundantly in both tree canopies and at ground level 73 . The same phenomenon is likely occurring in Parelheiros, a region with low malaria endemicity in the city of São Paulo where An. cruzii was collected in high numbers in both tree canopies and at ground level habitats 29,37,45 .
The different patterns of An. cruzii acrodendrophily led previous authors to believe that this species could represent two different species 34 . After collecting An. cruzii specimens from tree canopies and ground level, Deane et al. (1984) marked them with dye and released them to test their pattern of vertical dispersal. When they had collected the marked specimens again, the authors noticed that mosquitoes released at ground level were collected in tree canopies and vice-versa. They suggested that the mosquitoes in the study area belonged to the same species but did not dismiss the possibility of An. cruzii being two species 34 . Although there is probably still movement between An. cruzii from tree canopies to ground level and vice-versa, our results suggest that in areas close to human dwellings, An. cruzii populations are more diverse and genetic structured at ground level.
Contrarily to this study, previous studies conducted in different regions of Brazil found significant genetic variation between populations of An. cruzii [74][75][76] . Evidence of structuring in An. cruzii populations from high altitudes and low altitudes were observed using both nuclear and mitochondrial markers 74,75 . Furthermore, it has been suggested that An. cruzii may be a complex with at least two sibling species occurring in the southern and southeastern regions of Brazil 76,77 . However, these studies were conducted using molecular markers other than SNPs and on a larger scale than the present study. The present study is not without limitations. The sampling was conducted in only one site of each type of habitat (Natural, Suburban/Rural and Urban) with no replications, therefore, the genetic differences found in this study, even though unlikely, could be due to a factor unique to that one site.
Effective vector-control strategies are very important to eradicate malaria. Because of their complex biology and ecology and similar genetic characteristics, An. cruzii populations are particularly difficult to control. The species moves from natural areas to domestic and peridomestic areas in search of blood sources, posing a Scientific Reports | (2020) 10:18018 | https://doi.org/10.1038/s41598-020-74152-3 www.nature.com/scientificreports/ challenge for control strategies based on integrated vector management 28,78 . In the past, deforestation and bromeliad destruction were common approaches adopted to reduce the population of this vector mosquito during malaria outbreaks, but these strategies are no longer acceptable 79 . Deforestation leads to loss of biodiversity, resulting in a reduction in the number of dead-end Plasmodium hosts in the Atlantic Forest and an increase in the likelihood of Plasmodium transmission to humans 23,28 .
Considering that An. cruzii is not a synanthropic species, a significant genetic structure between populations from Natural, Suburban/Rural, and Urban areas as well as tree-canopy and ground-level at the microgeographic scale indicates the disruption of its natural environments. Such presence of genetic variation in sympatric An. cruzii populations can be used as a proxy for ecosystem health, and to infer both the impact of deforestation and defaunation in its distribution as well as the extent of human exposure to this mosquito vector species. Environmental planning, forest conservation, and control of illegal human settlements and ecotourism in protected areas could play an important role in reducing the presence of this vector at ground level, resulting in turn in a likely reduction in plasmodial infections in the Atlantic Forest 23,28 .

Conclusions
Our findings point to the existence of structuring between populations of An. cruzii from tree canopies and ground level habitats in the urban environment, with the ground-level population showing higher genetic variation. Although this study has some limitations, our results point to the possibility that anthropogenic modifications leading to habitat fragmentation may be driving the biology of An. cruzii and maintaining genetic diversity and structure in populations more present at ground level. Our study design could be used to assess the genetics of An. cruzii populations in malaria-transmission areas where this species is abundant at ground level.

Material and methods
Study area. The chosen study area was the Capivari-Monos conservation area in the subdistrict of Parelheiros, São Paulo, Brazil. Three environments with different levels of anthropogenic environmental modifications were chosen: Natural-Atlantic Forest remnants on private property; Suburban/Rural-a transition area between Atlantic Forest remnants and a cattle range; and Urban-the Engenheiro Marsilac neighborhood. The map of sampling locations was produced using ArcGIS 10.2 (Esri, Redlands, CA).
Vegetation cover at each collection site was identified using the "Map of vegetation remnants of the Atlantic Forest Biome in the municipality of São Paulo" available at https ://geosa mpa.prefe itura .sp.gov.br/Pagin asPub licas /_SBC.aspx. A one-kilometer buffer, defined around each georeferenced collection site, was used to characterize vegetation cover at each collection site. The area within each buffer was then used to measure the proportion of vegetation cover around each collection site. Measurements were made in QGIS 2.18 (https ://www.qgis.org) 37 .
Mosquito collections and identification of specimens. Mosquito collections were conducted once a month from February 2016 to July 2017. Adult specimens were collected using CO 2 -baited CDC light traps (CDC-LT) 80 placed in tree canopies (at heights > 12 m) and at ground level (at a height of 1 m) for 14 h including dusk and dawn, when An. cruzii blood-feeding activity peaks. A Shannon light trap was also set up at each collection site for 2 h after dusk. Mosquitoes were collected over 17 months with the traps set in different locations within each collection site. When possible, 30 specimens collected throughout 17 months were randomly selected from each trap in each collection site to avoid bias introduced by analyzing siblings 81,82 . Mosquitoes were identified with the aid of taxonomic keys 26 and stored in 1.5 mL tubes with isopropyl alcohol at room temperature until DNA was extracted. SNP genotyping. Genomic DNA was extracted from mosquitoes' whole body using the DNeasy Blood and Tissue kit (Qiagen) following the manufacturer's protocol. A Qubit Fluorometer (ThermoFisher, Dietikon, Switzerland) was used to determine DNA concentration, and electrophoresis on 1% agarose gel was performed to assess the quality and purity of the samples. Raw genomic DNA was then sent to SNPsaurus (SNPsaurus, Oregon, USA) to be converted into nextRAD genotyping-by-sequencing libraries 83 .
Using Nextera technology (Illumina, Inc), genomic DNA was simultaneously fragmented and tagged with short adapter sequences to the cleaved ends. The Nextera reaction was scaled to fragment 10 ng of genomic DNA, of which up to 5 ng was used for input to help dilute inhibitors. Fragmented DNA was then amplified using Phusion Hot Start Flex DNA Polymerase (New England Biolabs, Inc., Ipswich, MA) with one of the primers matching the adapter and extending 8 nucleotides into the genomic DNA with the selective sequence TGC AGG AG. Thus, only fragments starting with a sequence that can be hybridized by the selective sequence of the primer will be efficiently amplified. PCR conditions were as follows: 3 min at 72 °C, 3 min at 98 °C, followed by 25 cycles of 45 s at 98 °C and 1 min at 75 °C. The nextRAD libraries were sequenced on a HiSeq 4000 with four lanes of single-end 150 bp reads (University of Oregon).
The genotyping analysis were processed using SNPsaurus nextRAD custom scripts (SNPsaurus, Oregon, USA) 83,84 that trimmed the reads with BBDuk (BBMap tools, https ://sourc eforg e.net/proje cts/bbmap /). Next, a de novo reference genome was created by collecting 10 million reads equally divided between the samples and excluding reads using the thresholds of 7 and 700, which have been empirically determined to produce de novo reference loci that are useful for alignment 83 . The remaining loci were then aligned to each other to identify allelic loci and collapse allelic haplotypes into a single representative. All reads were mapped to the reference with an alignment identity threshold of 0.95 using BBMap (BBMap tools). Genotype calling was performed using SAMtools 85 and BCFtools 86 . The variant call format (VCF) file was filtered to remove alleles with a population frequency of less than 3% and loci that were heterozygous in all samples or had more than 2 alleles in a sample, suggesting collapsed paralogs. The absence of artifacts was checked by counting SNPs at every nucleotide www.nature.com/scientificreports/ position read and verifying that the number of SNPs did not increase with reduced base quality toward the end of the read 87 .
Statistical and genetic analysis. VCFtools (https ://githu b.com/vcfto ols) 88 was used to perform the final SNP filtering of the genotype data. The criteria established for locus filtering were based on published data 83,87 : loci were filtered to assess minimum base quality (30), read depth (20), linkage disequilibrium to select a single independent SNP per locus, minimum allele frequency of 5% across all populations, exclusion of loci with more than 5% of missing data, and removal of indels. The filtered dataset was entered in PGDSpider to convert the data into STRU CTU RE 2.3.4 89 and Arlequin 3.5 90 formats. Filtered loci were analyzed to produce smoothed quantiles that enabled the relationship between F ST and heterozygosity in each locus to be determined. This analysis uses the raw empirical data 56 to identify outlier loci which can be under selection. The analysis was carried out with the R package fsthet 56,91 .
After filtering and identification of outlier loci, the resulting independent and neutral SNPs were used to characterize the populations genetically. Anopheles cruzii specimens collected in three different environments with different levels of anthropogenic changes (Natural, Suburban/Rural and Urban) were considered three different populations and its respective SNPs were then separated according to each of the five hypotheses tested. In the tests of each hypothesis, each sample being compared was considered a different 'population' . Data corresponding to each hypothesis were analyzed with the statistical and genetic analysis software. All analyses comparing tree-canopy and ground-level samples were carried out with specimens collected only in the year of 2016.
Observed heterozygosity (Ho), expected heterozygosity (He) and Hardy-Weinberg equilibrium (HWE) tests for all SNPs per population per hypotheses were performed in Arlequin v3.5 90 to investigate heterozygosis among the populations. To assess the genetic variance between populations, we used the genetic structure estimators D 57 , F ST and G'' ST 58 , and the inbreeding coefficient F IS were estimated globally in R using the StrataG package 92 . Pairwise estimates of D, F ST and G'' ST among all populations in hypothesis 2 were conducted in R using the StrataG package 92 to further investigate genetic structure between tree-canopy and ground-level populations. P values for all multiple tests were adjusted by applying the false discovery rate correction 93 using the function p.adjust in R 91 . To assess genetic diversity among sampled population, we used the multivariate principal component analysis (PCA) carried out with adegenet 94 and ade4 95 in R. Analysis of molecular variance (AMOVA) was performed in Arlequin 3.5 90 to detect population differentiation in all hypotheses. To test for isolation by distance for the populations in hypothesis 3, a Mantel test between genetic distance (F ST /(1 − F ST )) and geographic distance in kilometers was calculated with ade4 95 package in R 91 using 9,999 permutations.
Bayesian cluster analysis in STRU CTU RE was performed with the StrataG package in R to infer population structure between populations in all tested hypotheses. The analysis was carried out assuming the Admixture model and correlated allelic frequencies among populations [52][53][54] . Run length was set to 100,000 MCMC replicates after a burn-in period of 100,000. The number of clusters (K) varied from 1 to 10, with 10 replicates for each value of K 83 . The most likely number of clusters was determined by the ΔK method 96 , which identifies genetically homogeneous groups of individuals, used in Structure Harvester 97 .

Data availability
Data supporting the conclusions are included within the article. The dataset analyzed during the current study are available in the Mendeley Data repository, https ://doi.org/10.17632 /jd72b 5v8vv .1.