Introduction

Unveiling biotic evolution in a given region under a spatiotemporal framework helps to understand the origin of biodiversity on Earth. In this context, studies that combine phylogeographic approaches and hybrid zone inference can be crucial to understand the biogeographic processes underlying the origin of lineages and the evolutionary processes responsible for speciation, respectively. Specifically, phylogeographic studies allow one to infer whether landform and/or historical climatic changes have played a major role on the diversification of biotas (Cheviron et al. 2005; Avise 2009; Maldonado‐Coelho et al. 2013; Carnaval et al. 2014). Recent analytical advances in this field also opened the possibility of testing alternative models (e.g., those that include dispersal) to those based on pure vicariance in explaining large scale patterns of diversification across complex landscapes (Smith et al. 2014). In addition, detailed studies in areas of geographical overlap and admixture between genetically divergent populations allow the assessment of the role of selection on new gene combinations that could not be tested in the same genetic background before hybridization. In general, distinct demographic and evolutionary processes can potentially explain narrow clines of genetic and phenotypic traits. Some form of selection is often invoked as an important force hindering the spread of allelic variants across the hybrid zone (Endler 1977; Barton and Hewitt 1985; Barton and Gale 1993). Overall, loci/traits under strong selection and involved in reproductive barriers have limited introgression, resulting in abrupt gradients of allele frequencies and narrow clines. On the other hand, neutral loci or loci under weak selection have more gradual gradients of allele frequencies and broad clines. Thus, hybrid zone inferences are fundamental to understand the evolution of barriers to gene flow. 
For example, sigmoid/stepped clines and strong levels of linkage disequilibrium (LD) indicate a substantial amount of reproductive isolation (Szymura and Barton 1991; Vines et al. 2016).

In this context, a framework that simultaneously tests, on the one hand, the role of physical geographical barriers (e.g., rivers) versus historical climate changes (e.g., refuges) in population divergence and, on the other, whether hybrid zones originated by primary differentiation or by secondary contact would be valuable for understanding how biological diversity builds up in a given region. More specifically, if genetically divergent sister populations or taxa exhibit signatures of population expansion over time on both sides of a geological barrier, this may imply that they have expanded their ranges historically, which does not allow one to reject a scenario of divergence in refuges (Moritz et al. 2000; Cheviron et al. 2005; Maldonado‐Coelho et al. 2013). Such an approach is also relevant when testing alternative scenarios underlying the origin and dynamics of hybrid zones (Morales-Rozo et al. 2017). This has been a contentious issue; one model implies that hybrid zones are simply the result of contact and interbreeding between populations that differentiated genetically in geographic isolation. Alternatively, hybrid zones may have arisen in situ in response to spatial gradients of selection (Mayr 1942; Endler 1977; Harrison 1990; Durrett et al. 2000). In fact, different hybrid zones could have originated under either of these scenarios, and current patterns of genetic and phenotypic variation alone are usually insufficient to differentiate between them (Endler 1977, 1982). However, signatures of population expansion, as inferred from genetic evidence and from paleodistribution modeling, allow these alternative scenarios to be tested. That is, a signature of population expansion favors a secondary contact model, whereas evidence of constant population size is consistent with a primary intergradation model (Morales-Rozo et al. 2017).
Hence, population expansion in association with other lines of evidence allows one to distinguish, with the same set of analyses, whether (i) the biogeographic processes responsible for the divergence of populations were related either to historical climatic changes or to the formation of a geographic barrier; (ii) the origin of hybrid zones is linked to a primary environmental gradient or if it originated following secondary contact between populations that differentiated in allopatry.

These evolutionary inferences regarding processes leading to population divergence and diversification are extremely valuable in the Neotropics, one of the regions with the highest levels of species richness in the world (Myers et al. 2000), and an understanding of patterns and processes that shape this biological diversity is not a trivial task for biologists. There, the highly diverse and threatened Atlantic Forest (AF) has been the focus of many studies. During the last years, several phylogeographical studies (Cabanne et al. 2008; Carnaval et al. 2009, 2014; Martins et al. 2009; Ribeiro et al. 2010; d’Horta et al. 2011; Maldonado-Coelho 2012; Amaral et al. 2013; Batalha-Filho and Miyaki 2016) have suggested that historical climatic changes (Haffer 1969; Vanzolini and Williams 1970) played a major role on the diversification of the AF biota during the Pleistocene, while other studies indicated other biogeographic processes as drivers of diversification. For example, rivers as barriers (Wallace 1852) were implied as being important in population divergence of lizards (Pellegrino et al. 2005; but see Cabanne et al. (2007) and Maldonado-Coelho (2012)). Additional studies also have suggested neotectonics as a possible diversification driver in some AF organisms (Silva and Straube 1996; Batalha-Filho et al. 2010; Brunes et al. 2010; Thomé et al. 2010, 2014; Amaro et al. 2012). Recently, Leite et al. (2016) presented an alternative model, in which small mammal species have expanded their ranges into the continental shelf during the last glacial maximum (LGM), therefore challenging the traditional scenario of population range fragmentation into distinct continental Pleistocene refugia. These competing scenarios portray a complex evolutionary history of the AF biota (Turchetto-Zolet et al. 2013), and imply that many distinct biogeographic processes have acted continuously since the Tertiary (Rull 2008, 2011).

In contrast to this increasing number of phylogeographic studies, the dynamics of contact and hybrid zones between lineages in some portions of the biome remain an important and neglected issue in the evolution of the AF biota. Areas of overlap between sister phylogeographic lineages have been identified for some organisms throughout the AF (Martins 2011), with some level of admixture in contact zones in or near river valleys (Maldonado-Coelho 2012). For instance, some authors found evidence for genetic admixture between frog lineages in the Doce and Jequitinhonha river valleys (Rhinella crucifer species complex, Thomé et al. 2012; Phyllomedusa burmeisteri species complex, Brunes et al. 2014; Dendropsophus elegans and Chiasmocleis carvalhoi, Tonini et al. 2013). In birds, contact zones were identified in the Paraíba do Sul and Doce River valleys (Xiphorhynchus fuscus, Cabanne et al. 2007; Sclerurus scansor, d’Horta et al. 2011; Pyriglena leucoptera, Maldonado-Coelho 2012; Conopophaga lineata, Dantas et al. 2015) and in the Paraguaçu River Valley (between P. leucoptera and Pyriglena atra, Maldonado-Coelho 2012). Some of these contact zones are thought to have originated by secondary contact between populations that expanded their ranges out of refuges after the onset of warmer periods in the Quaternary. However, there have been no formal tests of the origin of contact zones in the AF. Moreover, some divergence times are older than the Quaternary and challenge the role of more recent (i.e., Pleistocene) historical climatic changes as drivers of population divergence (Thomé et al. 2010, 2012). In turn, this implies that either the same kind of biogeographical process has acted over broad temporal scales or that distinct mechanisms have been at play in distinct evolutionary time intervals.

The Rufous-capped Spinetail species complex (Synallaxis ruficapilla) constitutes a good model to shed light on the biogeographic processes underlying the patterns of diversity and speciation in the AF. This suboscine passerine species complex is endemic to the AF (Pacheco and Gonzaga 1995), and comprises S. ruficapilla, the Bahia Spinetail (Synallaxis cinerea) and the Pinto’s Spinetail (Synallaxis infuscata). A taxonomic review based on plumage, morphometric data, and songs suggested that S. cinerea is best regarded as a synonym of S. ruficapilla, the former being only a geographic variant of the latter (Stopiglia et al. 2013). However, Batalha-Filho et al. (2013) showed that there is a pronounced mitochondrial divergence (~3%) between these taxa. That study also suggested that the S. ruficapilla complex is not monophyletic. Although S. ruficapilla and S. cinerea form a sister group with high support, S. infuscata seems to be more closely related to Synallaxis moesta (from the Andes and the Tepuis mountain ranges), but this relationship has weak statistical support. S. ruficapilla occurs from the central part of the AF in Brazil south to Paraguay and northeastern Argentina (Misiones) (Ridgely and Tudor 1994), while S. cinerea ranges from the central part of the AF north to the southern bank of the Paraguaçu River (pers. obs., Silveira 2008). Both species inhabit the understory of evergreen and semi-deciduous forest and are regularly found along forest edges (Ridgely and Tudor 1994; Sick 1997). Furthermore, Batalha-Filho et al. (2013) suggested a possible admixture zone between these species based on the pattern of co-occurrence of divergent mtDNA lineages. Thus, this species complex is a good model to understand the processes underlying diversification in the AF, as divergent populations meet and hybridize along a river valley where phylogeographic breaks of other taxa have been described (i.e., in the Jequitinhonha River Valley; Batalha-Filho et al. 2012).

Here, we study this species complex to understand the patterns of diversification in the AF and the dynamics of a hybrid zone between two lineages along a major river system. We used mitochondrial and nuclear (autosomal and sex-linked) genes in a spatiotemporal framework to address the following questions: (i) What is the genetic structure of S. ruficapilla and S. cinerea across the AF? (ii) What is the location and extent of the hybrid zone between these lineages? (iii) Which biogeographic scenario could explain population divergence in this group? (iv) Did the hybrid zone originate by secondary contact or primary divergence? We hope that by combining phylogeographic, hybrid zone inferences, and ecological niche models (ENMs), insights will be gained into the biogeographic processes underlying the origin of the lineages and in the evolution of reproductive barriers that maintain their genetic integrity.

Material and methods

Sampling and molecular procedures

Tissue samples (muscle and blood) of 87 specimens of S. ruficapilla and 22 specimens of S. cinerea were sampled at 34 localities across their range in the AF (Fig. 1 and Table S1). Most samples (with two exceptions) are associated with vouchered specimens (Table S1). Molecular laboratory procedures of DNA extraction, amplification, and sequencing followed Batalha-Filho et al. (2012). We obtained sequences of two mitochondrial genes: cytochrome b (cytb) with primers L-14841 and H-16065 (Lougheed et al. 2000); and NADH dehydrogenase subunit 2 (ND2) with primers Lmet (Hackett 1996) and H6313 (Johnson and Sorenson 1998). We also sequenced two autosomal nuclear introns: β-fibrinogen intron 5 (FIB5) with primers FIB5 and FIB6 (Marini and Hackett 2002) and myoglobin intron 2 (myo2) using primers Myo2 and Myo3F (Slade et al. 1993; Heslewood et al. 1998, respectively). Some sequences (17 of cytb, 16 of ND2, 17 of FIB5, and 16 of myo2) for these genes were generated in a previous study (Batalha-Filho et al. 2013). To investigate the hybrid zone we generated sequences of three Z-chromosome-linked genes from a subset of 48 samples from the contact zone between S. ruficapilla and S. cinerea as indicated by the mtDNA variation (Fig. 2): BRM15 with primers BRM-15F and BRM-15R (Borge et al. 2005), CHDZ18 with primers CHDZ-18F and CHDZ-18R (Borge et al. 2005) and PLAA with primers PLAA-F and PLAA-R (Backström et al. 2010). PCR conditions for all genes followed Batalha-Filho et al. (2013), but the annealing temperatures of the Z-linked loci were different: BRM15—60 °C, CHDZ18—56 °C, and PLAA—57 °C. Primers sequences are given in Table S2.

Fig. 1

a Localities sampled for Synallaxis ruficapilla and S. cinerea. The darker the area in the map, the higher the altitude. The colors of the localities indicate the mtDNA lineages recovered in the haplotype network and the phylogeny: black—southern S. ruficapilla; gray—northern S. ruficapilla; white—S. cinerea. b Haplotype network based on concatenated cytb (1001 bp) and ND2 (1041 bp) from 98 individuals with both sequences available. Each circle corresponds to one haplotype, and its size is proportional to frequency; each mark on lines connecting haplotypes refers to a mutational step. The smallest black circles depict median vectors. Colors correspond to mitochondrial phylogroups. Haplotype with asterisks depicts S. ruficapilla individuals that fall in the S. cinerea clade. c Bayesian phylogeny based on concatenated cytb and ND2 genes. Node supports are posterior probabilities of Bayesian inference

Fig. 2

Haplotype networks (a–c) and Structure results (d) based on Z-linked loci for S. ruficapilla and S. cinerea populations spanning the hybrid zone. Maps show the distribution of segregating alleles of the BRM15 (a, 356 bp), CHDZ18 (b, 286 bp), and PLAA (c, 587 bp) genes. Dashed lines in the maps depict the 2D hybrid zone transect (Table S2). d Plot of membership coefficients of the best clustering (k = 2) from Structure. Black bars represent S. ruficapilla and white bars indicate S. cinerea

As Synallaxis does not exhibit plumage sexual dimorphism, we performed molecular sexing of the 48 individuals from the hybrid zone to identify the number of gene copies of sex-linked (Z-linked) loci. We used M5 (Bantock et al. 2008) and P8 (Griffiths et al. 1998) primers. PCR (10 µL) contained template DNA (50 ng), 1× of Taq buffer (GE Healthcare), 2 mM of MgCl2, dNTPs (0.32 µM), 0.5 µM of each primer and 0.5 U of Taq polymerase (GE Healthcare). PCR conditions were an initial denaturation step at 94 °C for 8 min; followed by 40 cycles at 94 °C for 1 min, 54 °C for 30 s, and 72 °C for 30 s plus a final extension step at 72 °C for 2 min. PCR products were checked for size and quality of the band in 3% agarose gels.

Sequence editing, alignment, phasing, and recombination

Electropherograms were inspected and assembled in contigs using CodonCode Aligner v. 3.7 (CodonCode Inc.). Heterozygous sites of nuclear loci were coded according to IUPAC code when double peaks were present in both strands of the same individual’s electropherograms. Nuclear sequences that contained heterozygous indels were analyzed using the algorithm Process Heterozygous Indels in CodonCode Aligner v. 3.7. Sequences were aligned using the CLUSTAL W method (Higgins et al. 1994) in MEGA5 (Tamura et al. 2011). All alignments were inspected and corrected visually. The gametic phase of heterozygote individuals was resolved using the algorithm PHASE (Stephens et al. 2001) with default settings in DnaSP 5 (Librado and Rozas 2009) and 0.6 as the minimum probability. Individuals with lower probabilities were removed from further analyses. We used the PHI test in SPLITSTREE4 (Bruen et al. 2006; Huson and Bryant 2006) to check for recombination in nuclear loci. This test was used due to its power to distinguish recombination events from homoplasy (Bruen et al. 2006).

Population structure tests

We generated median-joining networks (Bandelt et al. 1999) using PopART 1.7 (Leigh and Bryant 2015) for each locus to infer the relationships between haplotypes and their geographic distribution. For further analyses, we concatenated mtDNA genes (ND2 and cytb). Z-linked genes were not included in coalescent demographic analyses (EBSP and IMa2) because they may be under selection and these analyses assume neutral variation.

We reconstructed the gene genealogies of populations and taxa from the concatenated mtDNA genes using Bayesian inference in MrBayes 3.2.3 (Ronquist et al. 2012) at the CIPRES Science Gateway (Miller et al. 2010). The best-fit model for each gene (GTR+I for cytb and HKY+G for ND2) was selected using MrModeltest 2.2 (Nylander 2004) based on the Akaike Information Criterion (AIC) in conjunction with PAUP* (Swofford 1998). To root the tree we used Synallaxis cabanisi (GenBank accession numbers: cytb, KC437438; ND2, KC437514), following Batalha-Filho et al. (2013). We performed two independent Bayesian runs of 10 million generations, each with four Markov chain Monte Carlo (MCMC) chains. The first one million generations were discarded as burn-in, after which trees were sampled every 500 generations. Chain convergence (effective sample size, ESS, values > 200) was checked using the likelihood plots for each run in Tracer 1.6 (http://beast.bio.ed.ac.uk/Tracer). The potential scale reduction factor was also used to check chain convergence and burn-in; values close to one indicate good convergence between runs (Gelman and Rubin 1992). Post burn-in trees were summarized in a 50% majority-rule Bayesian consensus tree. The tree was visualized in FigTree 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).
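
As an illustration of the convergence check mentioned above, the potential scale reduction factor for a single scalar parameter can be sketched as follows. This is a minimal Python version of the classic Gelman and Rubin (1992) ratio, not the code used by MrBayes; the function name and the toy chains are hypothetical.

```python
import math
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Gelman-Rubin potential scale reduction factor for one scalar parameter.

    `chains` is a list of equal-length lists of post burn-in samples,
    one list per independent run.
    """
    n_chains = len(chains)
    n_samples = len(chains[0])
    if n_chains < 2 or n_samples < 2 or any(len(c) != n_samples for c in chains):
        raise ValueError("need two or more equal-length chains with two or more samples")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])    # W: mean within-chain variance
    between = n_samples * variance(chain_means)     # B: between-chain variance
    pooled = (n_samples - 1) / n_samples * within + between / n_samples
    return math.sqrt(pooled / within)


# Hypothetical toy chains: the first pair overlaps, the second pair does not.
mixed = potential_scale_reduction([[1, 2, 3, 4], [4, 3, 2, 1]])
stuck = potential_scale_reduction([[0, 1, 2, 3], [10, 11, 12, 13]])
print(f"overlapping runs: {mixed:.3f}; separated runs: {stuck:.3f}")
```

Values near one indicate that the independent runs sample the same distribution, whereas values well above one flag runs that have not converged.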

To assess the level of population genetic structure among localities we performed an analysis of molecular variance (AMOVA, Excoffier et al. 1992) with three hierarchical levels for each locus using ARLEQUIN 3.5.1.2 (Excoffier and Lischer 2010). We also estimated the fixation indices ΦCT (structure among groups) and ΦST (structure among localities within groups). For the third level of this analysis, we considered two groups comprising each species, and four groups comprising mitochondrial phylogroups, as indicated in Fig. 1. Statistical significance was obtained with 1000 permutations. For this analysis, we used only localities with at least two samples.

Historical demography

We estimated nucleotide diversity per site (π) and number of haplotypes (h) for each locus using DnaSP5. To test for historical demographic size changes, we applied the neutrality test Fu’s Fs (Fu 1997) and the R2 test (Ramos-Onsins and Rozas 2002) in DnaSP5. Significance levels were obtained based on 10,000 coalescent simulations.
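
The two summary statistics named above can be illustrated with a short Python sketch. It assumes equal-length aligned sequences with no gaps or ambiguity codes, and it is not the DnaSP implementation; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi)."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / n_pairs / length


def haplotype_count(seqs):
    """Number of distinct haplotypes (h) in the sample."""
    return len(set(seqs))


# Hypothetical toy alignment of four sequences, 8 bp each.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "ACGTACGT"]
pi = nucleotide_diversity(toy)
h = haplotype_count(toy)
print(f"pi = {pi:.4f}, h = {h}")
```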

We also inferred the population size dynamics through time for all loci combined by using the Extended Bayesian Skyline Plot method (EBSP; Heled and Drummond 2008) implemented in BEAST 1.6.2 (Drummond and Rambaut 2007). This method reduces the stochastic effect of the coalescent process by combining different gene trees in a non-parametric coalescent approach to estimate fluctuations of the effective population size through time (Ho and Shapiro 2011). Two independent EBSP runs of 150–250 million MCMC steps were performed for each mitochondrial phylogroup with the following settings: an initial UPGMA tree, a linear model, parameters sampled every 10,000 steps, and a burn-in of 20%. The hLRT test implemented in MEGA5 did not reject the molecular clock hypothesis at the 0.05 significance level for any locus in either clade, except for mtDNA in the southern clade and myo2 in the northern clade of S. ruficapilla. Thus, we applied an uncorrelated lognormal relaxed clock prior to the partitions for which the hLRT rejected the clock and a strict clock prior to the remaining loci. The best-fit model for each gene was selected using MrModeltest 2.2 based on the AIC in conjunction with PAUP*. We used 1.05% (s.d. 0.05%) per lineage per million years as the mutation rate for mtDNA (Weir and Schluter 2008) under a normal prior distribution. BEAST then estimated the mutation rate of each nuclear intron individually under a default lognormal prior distribution. To check the convergence of parameters between runs and the analysis performance (ESS values > 200) we used TRACER 1.6. After removing the burn-in, we combined the independent runs for each phylogroup using LogCombiner.

Isolation and migration

We obtained estimates of divergence time and gene flow among lineages of the S. ruficapilla complex from the mitochondrial and autosomal loci (Z-linked genes were excluded, as noted above) under the multilocus Bayesian coalescent method of isolation with migration (Nielsen and Wakeley 2001; Hey and Nielsen 2004) implemented in the software IMa2 version 27/08/2012 (Hey 2010). IMa2 implements the isolation with migration method for more than two populations in the same analysis. We therefore ran a single analysis for the three well-supported lineages of S. ruficapilla and S. cinerea (coastal clade) (Fig. 1). The Chapada Diamantina clade of S. cinerea was not included in this analysis, as only mtDNA was obtained for this group. In the analysis with three populations, the software estimates 13 demographic parameters (Fig. S1): effective population sizes of each population (θ1, θ2, and θ3) and of their ancestral populations (θ4 and θ5), asymmetrical migration rates among current populations (m1→2, m2→1, m1→3, m3→1, m2→3, and m3→2) and divergence times among populations (t0 and t1). We used the HKY model for all loci and inheritance scalars of 0.25 and 1.0 for the mtDNA and nuclear loci, respectively. Migration rates were estimated only for modern populations, assuming zero migration for ancestral populations. We used the topology recovered by MrBayes (Fig. 1): ((southern ruficapilla, northern ruficapilla), coastal cinerea). We allowed the program to estimate independent values of θ and m for each population. We performed many preliminary runs to adjust the maximum value of the prior distribution for each parameter and the number and heating scheme of the Markov chains. The analyses were then performed with 80 Markov chains and geometric heating (ha = 0.999 and hb = 0.3). The maximum values of priors for the demographic parameters were θ = 25, m = 10, t = 8. We performed four runs with different random seeds, each with 5 million MCMC generations, of which the first million were discarded as burn-in.
We assumed a generation time of 1 year as adopted in Cabanne et al. (2008). The parameters were scaled in demographic units using the same mutation rate of mitochondrial genes implemented in EBSP analyses. A mutation rate of 0.135% per site per lineage per million years for nuclear introns was applied (Ellegren 2007). We assumed errors in the mutation rates as priors in the calculation of scalar mutation of 0.96–1.13% and of 0.12–0.15% for mtDNA and nDNA, respectively (Ellegren 2007; Weir and Schluter 2008). To check the convergence of parameters in each run, we made sure that all ESS values were higher than 50.
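
The scaling step described above can be sketched in Python. The sketch assumes the usual isolation-with-migration conventions (θ = 4Neu and t scaled by the mutation rate), per-locus rates obtained as the per-site rate times the alignment length, and a geometric mean across the mitochondrial and the two autosomal loci. The θ and t values passed in at the end are hypothetical placeholders, not results of this study.

```python
import math

# Per-site substitution rates per lineage per year, from the rates quoted in the text.
SITE_RATE_PER_YEAR = {"mtDNA": 1.05e-8, "FIB5": 1.35e-9, "myo2": 1.35e-9}
# Alignment lengths in bp (mtDNA = concatenated cytb + ND2).
LOCUS_LENGTH_BP = {"mtDNA": 1001 + 1041, "FIB5": 525, "myo2": 601}


def per_locus_rates():
    """Mutation rate per locus per year (per-site rate times length)."""
    return {name: SITE_RATE_PER_YEAR[name] * LOCUS_LENGTH_BP[name]
            for name in SITE_RATE_PER_YEAR}


def geometric_mean(values):
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def to_demographic_units(theta, t, rate_per_year, generation_time_years=1.0):
    """Convert mutation-scaled theta and t to Ne (individuals) and time (years)."""
    effective_size = theta / (4.0 * rate_per_year * generation_time_years)
    years = t / rate_per_year
    return effective_size, years


rates = per_locus_rates()
mean_rate = geometric_mean(rates.values())
# Hypothetical posterior estimates, for illustration only.
ne, years = to_demographic_units(theta=5.0, t=1.0, rate_per_year=mean_rate)
print(f"Ne = {ne:,.0f}; divergence = {years:,.0f} years")
```

The generation time of 1 year follows the text; changing it rescales Ne but leaves the divergence time in years unchanged.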

Hybrid zone analyses

To evaluate levels of introgression between S. ruficapilla and S. cinerea in the subset of 48 samples sequenced for Z-linked genes (Fig. 2), we combined three approaches: median-joining networks generated for each locus in PopART 1.7, Bayesian clustering under an admixture model, and cline model fitting. We used the software Structure 2.3.4 (Pritchard et al. 2000) to test numbers of clusters (K) ranging from one to five, repeating each run five times. Each run consisted of 5 million MCMC generations, with the first 500,000 generations discarded as burn-in. We used the admixture model to evaluate the level of introgression between species across the sampled localities. The most likely estimate of K was obtained using Evanno’s method (Evanno et al. 2005) implemented in Structure Harvester (Earl and vonHoldt 2012). Runs for the most likely K were combined using Clumpp 1.1.2 (Jakobsson and Rosenberg 2007).
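
The ΔK statistic behind Evanno’s method can be illustrated with a short Python sketch. It computes the absolute second-order difference of the replicate-mean log-likelihoods divided by their standard deviation, which is one common way to evaluate the statistic; the function name and the log-likelihood values below are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Delta K for every K that has neighbours K-1 and K+1.

    `lnp_by_k` maps each K to the list of Ln P(D) values from its replicate runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if (k - 1) in means and (k + 1) in means and sds[k] > 0:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta[k] = second_diff / sds[k]
    return delta


# Hypothetical Ln P(D) values for K = 1..5, five replicates each.
toy_runs = {
    1: [-500, -501, -499, -500, -500],
    2: [-400, -401, -399, -400, -400],
    3: [-395, -390, -400, -395, -395],
    4: [-393, -385, -401, -393, -393],
    5: [-392, -380, -404, -392, -392],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
print(f"best K by Delta K: {best_k}")
```

Because ΔK needs both neighbours, it is undefined for the smallest and largest K tested, so with K from one to five only K = 2 to 4 can be ranked.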

For cline fitting we considered only the lineages of S. ruficapilla and S. cinerea in geographic contact, that is, the southern and northern clades of the former and the coastal clade of the latter. The northern clade of S. ruficapilla was not recovered based on Z-linked loci (see Results). Thus, the northern and southern clades of S. ruficapilla were grouped together in cline fit analyses based on Z-linked loci. These clades were also combined in the cline analysis for mtDNA, as they share many segregating sites in mitochondrial genes. Cline fit inferences were based on SNP frequencies along the transect. For each locus we identified a segregating SNP in samples from parental populations with a frequency higher than 0.8 at the southern tail and lower than 0.2 at the northern tail. The only exception was the CHDZ18 locus, in which no segregating SNP reached a high frequency (maximum 0.44) at the southern tail of the transect. The segregating SNPs selected for each locus are as follows: site 95 in cytb and site 723 in ND2 (mtDNA); site 110 in CHDZ18; site 319 in BRM15; site 180 in PLAA.

A similar approach to previous analyses (Macholán et al. 2007) was used to reduce the dimensionality of the hybrid zone transect. A smoothing procedure was performed in SigmaPlot 13.0 using the 3D Contour Plot, in which each site was defined by a combination of its geographical coordinates and average allele frequency (over all loci). The zone centre was defined as the 0.5 isocline and a linear transect was generated along a long axis of accumulating distances from the southernmost sampling site (locality 1; Fig. 2) to the intersection of shortest perpendicular straight-lines drawn from the long axis to each site.

Tests on the deviation from the Hardy–Weinberg equilibrium were performed for all Z-linked loci in GENEPOP (http://genepop.curtin.edu.au/; Rousset 2008). Population coefficient of ancestry (q) estimated in Structure for all loci with k = 2 and population SNP frequencies for each of the Z-linked and the mtDNA loci were fit to equilibrium geographic clines (Szymura and Barton 1986; Gay et al. 2008) using the package hzar (Derryberry et al. 2014) in R 3.4.4 (R Core Team). The MCMC algorithm was employed in a set of 15 models available in hzar (Derryberry et al. 2014), which differ in estimated cline shape parameters. All models estimated cline centre (c, in km from sampling locality 1, Fig. 2) and cline width (w, 1/maximum slope). In addition, models varied in (i) the fit of exponential decay tail parameters δ and τ (none, mirrored tails, both tails separately, right tail only and left tail only), where δ and τ represent the distance from the cline centre to the tail and the tail slope, respectively and (ii) in the scaling of maximum (pmax) and minimum (pmin) allele frequencies (fixed or free to vary); i.e., some models had no scaling (pmin = 0, pmax = 1), fixed scaling (pmin and pmax using observed minimum and maximum frequencies) or allowed pmin and pmax to be estimated. These models were compared to a null model with no clinal transition. In all models, we constrained the MCMC to sample the geographic distance encompassed by the length of our transect. Three independent chains for 1.0 × 10^6 generations were run for each model. The fit of these 15 models was then compared using AIC corrected for small sample size (AICc). Adequate mixing and convergence in cline parameter estimation were evaluated by visualizing MCMC sampling trajectories. Coincidence and concordance among loci clines were evaluated using two log-likelihood confidence intervals from the best cline model for each locus.
Non-overlapping confidence intervals were taken as evidence of differences in cline position and width.
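
To make the cline parameters concrete, the simplest model in this family (a sigmoid cline with no exponential tails) and the AICc used for model comparison can be sketched in Python. This is an illustration only, independent of the hzar package; the function names and the numeric values are hypothetical.

```python
import math


def sigmoid_cline(x_km, centre, width, p_min=0.0, p_max=1.0):
    """Expected allele frequency at distance x_km along the transect.

    Uses the tanh form of the sigmoid cline, for which the maximum slope
    of the unscaled curve equals 1 / width.
    """
    unscaled = 0.5 * (1.0 + math.tanh(2.0 * (x_km - centre) / width))
    return p_min + (p_max - p_min) * unscaled


def aicc(log_likelihood, n_params, n_obs):
    """AIC corrected for small sample size."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_obs <= n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


# Hypothetical cline: centre at 300 km, width 50 km.
centre_km, width_km = 300.0, 50.0
p_at_centre = sigmoid_cline(centre_km, centre_km, width_km)
step = 1e-4
slope = (sigmoid_cline(centre_km + step, centre_km, width_km)
         - sigmoid_cline(centre_km - step, centre_km, width_km)) / (2 * step)
print(f"p(centre) = {p_at_centre:.2f}; 1/slope = {1 / slope:.1f} km")
```

In a model comparison of this kind, each candidate's maximized log-likelihood and parameter count would be passed to `aicc`, and the model with the lowest value retained.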

In addition, these clines were contrasted to a model of neutral diffusion (Endler 1977; Barton and Gale 1993) that estimates the width of a cline under no selection or barriers to gene flow: w = 2.51σ√t. Here, the width of a cline (w) is a function of the number of generations since secondary contact (t) and parent-offspring dispersal distance (σ). A neutral cline wider than the clines estimated for a given trait indicates that selection is preventing genetic homogenization of populations. The mean natal dispersal distance of 0.931 km estimated for another suboscine bird (Woltmann et al. 2012) was employed. The time since secondary contact was based on our paleomodeling showing a northward expansion of the southern species during the LGM (ca. 0.021 mya). Generation time was assumed to be 1 year as above.
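
As a worked example, the neutral expectation can be evaluated directly from the values given above, taking the neutral diffusion formula in its square-root form, w = 2.51σ√t. The Python sketch below is an arithmetic illustration only; the function and constant names are hypothetical.

```python
import math


def neutral_cline_width(sigma_km, generations):
    """Expected width (km) of a neutral cline after `generations` of contact."""
    return 2.51 * sigma_km * math.sqrt(generations)


SIGMA_KM = 0.931                 # natal dispersal distance quoted in the text
YEARS_SINCE_CONTACT = 21_000     # LGM, ca. 0.021 mya
GENERATION_TIME_YEARS = 1        # assumed generation time

generations = YEARS_SINCE_CONTACT / GENERATION_TIME_YEARS
neutral_width = neutral_cline_width(SIGMA_KM, generations)
print(f"neutral cline width: {neutral_width:.0f} km")
```

With these inputs the neutral expectation is roughly 339 km, so fitted clines that are much narrower would be consistent with selection limiting introgression.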

Inferring the paleodistribution

To infer the distribution of AF spinetails through the late Quaternary we generated ENMs using the package BIOMOD (Thuiller et al. 2009) in R 3.4.4. We implemented the ensemble forecasting approach (Araújo and New 2007), which combines different modeling algorithms while accounting for variation among their results and the statistical support for each model. We implemented all 10 algorithms available in BIOMOD: Generalized Linear Model, Generalized Additive Model, Generalized Boosting Model, Classification Tree Analysis, Artificial Neural Network, Surface Range Envelope (BIOCLIM), Flexible Discriminant Analysis, Multiple Adaptive Regression Splines, Random Forest and Maximum Entropy. The occurrence points were the same as those used for genetic sampling (Table S1), because for these records we could confirm from the genetic data whether each locality belongs to S. ruficapilla or S. cinerea. We generated ENMs using three different strategies to take into account possible niche divergence between species: combined records of S. ruficapilla and S. cinerea (34 localities); only records of S. ruficapilla (27 localities); and only records of S. cinerea (7 localities). Localities with admixture between species were also included and assigned to the species whose mtDNA lineages were most frequent there. We gathered environmental variables from the WorldClim online database v1.4 (www.worldclim.org/bioclim). Based on a factor analysis of a correlation matrix, used to minimize collinearity problems when constructing models (Terribile et al. 2012), we selected seven of the 19 available Bioclim variables: mean diurnal range, temperature seasonality, maximum temperature of warmest month, precipitation of wettest month, precipitation of driest month, precipitation seasonality, and precipitation of warmest quarter.
Bioclim variables were used at 2.5 arc-min resolution (approximately 4.5 km at the Equator) and cropped to the extent of South America (between latitudes 14°N and 57°S, and longitudes 30°W and 85°W). We used layers for (i) current conditions, (ii) Mid-Holocene (MH, ~0.006 mya), (iii) LGM (~0.022 mya), and (iv) Last Interglacial (LIG; ~0.12–0.14 mya). As the MH and LGM periods have more than one Atmosphere–Ocean General Circulation Model (AOGCM) available, we restricted our AOGCM dataset to those present in both periods: CCSM4, MIROC-ESM, and MPI-ESM-P, developed through the Coupled Model Intercomparison Project Phase 5 (CMIP5) (http://cmip-pcmdi.llnl.gov/cmip5/). For each period, we calculated the cell-wise mean of each variable across the AOGCMs using the raster package (Hijmans and van Etten 2014) in R 3.4.4. We chose this approach in order to yield a single paleodistribution model for each period (instead of a different hypothesis for each AOGCM). Model calibration was based on 20 replicates for each algorithm: the first ten replicates randomly split occurrence points into training data (75%) and testing data (25%) for model evaluation (partitioned models) and the remaining ten replicates were performed using the total data set (full models). Evaluation was based on the TSS (True Skill Statistic; Allouche et al. 2006), and full models with TSS > 0.9 were then used to generate the final ensemble outputs. Paleoclimate scenarios were generated by projecting the current distribution model onto the layers of each past period. Resulting grid layers were visualized in DIVA-GIS 7.5 (https://www.diva-gis.org/).
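
Two of the steps above, averaging a climate variable cell by cell across AOGCMs and scoring models with the TSS, can be illustrated with a minimal Python sketch. It does not reproduce the BIOMOD or raster workflows; the function names, grids, and confusion-matrix counts are hypothetical.

```python
def true_skill_statistic(tp, fn, tn, fp):
    """TSS = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def cellwise_mean(layers):
    """Cell-by-cell mean of equally shaped 2D grids (lists of rows)."""
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    if any(len(g) != n_rows or any(len(r) != n_cols for r in g) for g in layers):
        raise ValueError("all layers must share the same grid shape")
    return [[sum(g[i][j] for g in layers) / len(layers) for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical 2 x 2 grids of one variable from three circulation models.
grids = [
    [[1, 2], [3, 4]],
    [[3, 4], [5, 6]],
    [[5, 6], [7, 8]],
]
mean_grid = cellwise_mean(grids)

# Hypothetical confusion-matrix counts (tp, fn, tn, fp) for two models.
scores = {
    "model_a": true_skill_statistic(tp=19, fn=1, tn=97, fp=3),
    "model_b": true_skill_statistic(tp=9, fn=1, tn=95, fp=5),
}
retained = sorted(name for name, tss in scores.items() if tss > 0.9)
print(mean_grid, retained)
```

Only models passing the TSS threshold are kept for the ensemble, mirroring the filter described in the text.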

Results

Sequence features

We obtained 1001 bp for cytb (N = 99), 1041 bp for ND2 (N = 93), 525 bp for FIB5 (N = 97), and 601 bp for myo2 (N = 89). In addition, we obtained sequences of three Z-linked genes for all 48 samples from the contact zone between S. ruficapilla and S. cinerea: 356 bp of BRM15, 286 bp of CHDZ18, and 587 bp of PLAA. In the mtDNA genes, we observed 66 and 50 variable sites in cytb and ND2, respectively, and no indels were detected. We did not detect stop codons in these genes, which supports the mitochondrial origin of the sequences. We observed 29 and 28 polymorphic sites in the autosomal introns FIB5 and myo2, respectively. We did not observe indels in myo2, but in FIB5 we found a 7 bp indel in five individuals. For the Z-linked genes we observed five, ten, and eight variable sites for BRM15, CHDZ18, and PLAA, respectively. We observed a 1 bp and a 5 bp indel in two samples in the BRM15 gene. Indels were not considered in further analyses.

PHASE resolved alleles with high probability for all nuclear loci (p > 0.6 for autosomal loci; p > 0.9 for Z-linked loci). The PHI test did not find significant evidence of recombination at any nuclear locus (FIB5, p = 0.939; myo2, p = 0.221; BRM15, p = 1.0; CHDZ18, p = 0.122; PLAA, p = 1.0).

Population structure

The Bayesian phylogenetic tree based on mtDNA data evidenced four clades with high support (Fig. 1c): (1) a southern clade of S. ruficapilla from the south to the central AF; (2) a northern clade of S. ruficapilla from the central region of the AF that is geographically restricted at the interfluvium of Doce and Jequitinhonha rivers; (3) a coastal clade of S. cinerea from the Jequitinhonha River Valley in central AF to the northern range of this species; and (4) an inland S. cinerea clade from the forests flanking the Chapada Diamantina Mountain Range (hereafter Chapada Diamantina clade). Haplotypes of the northern clade of S. ruficapilla were found at many localities where there are also haplotypes of both the southern clade of S. ruficapilla and the coastal clade of S. cinerea. The mitochondrial haplotype network (Fig. 1b) recovered the same population structure observed in the Bayesian tree. Haplotype networks based on autosomal loci did not detect a population structure as identified by the mtDNA (Fig. S1).

Our study recovered a contact zone among the coastal clade of S. cinerea and the two clades of S. ruficapilla at the interfluvium of the Doce and Jequitinhonha rivers (Figs. 1 and 2). Specifically, the mtDNA haplotypes of S. ruficapilla and S. cinerea co-occur in four localities along this contact zone (localities 3, 4, 5, and 7; Figs. 1 and 2). Furthermore, in one locality (5 in Figs. 1 and 2) all three mtDNA lineages co-occur, while in the other three localities we observed in sympatry only haplotypes of either the northern and southern S. ruficapilla clades or the northern S. ruficapilla and coastal S. cinerea clades.

The AMOVA results based on mtDNA significantly supported two and four groups (k = 2, k = 4) in the third hierarchical level as the best population clustering for the S. ruficapilla complex (Table 1). Nevertheless, clustering considering four groups retained more genetic variation in the third level, which is in accordance with the phylogenetic and haplotype network results (Fig. 1). AMOVA for autosomal introns did not recover any signal of population structure, as most of the variation was observed within localities (Table 1).

Table 1 Analysis of molecular variance (AMOVA) of S. ruficapilla and S. cinerea lineages

Historical demography

Summary statistics for all markers are shown in Table 2. The R2 test indicated population expansion based on mtDNA and FIB5 for the southern clade of S. ruficapilla (Table 2). The Fs test indicated population expansion for all S. ruficapilla clades based on mtDNA and FIB5. However, no signature of population expansion was recovered for the myo2 locus. Inference based on all loci combined in EBSP indicated population growth for all clades during the late Pleistocene (Fig. 3): the southern and northern clades of S. ruficapilla at ca. 0.05 and 0.10 mya, respectively, and the coastal clade of S. cinerea at ca. 0.02 mya. However, confidence intervals of all estimates overlapped, which indicates that population growth is best regarded as having occurred between 0.1 and 0.02 mya in all clades. EBSP for the Chapada Diamantina clade of S. cinerea was not generated, as only mtDNA data were available for this group.

Table 2 Summary statistics for clades of S. ruficapilla and S. cinerea
Fig. 3 Historical demographic estimates from Extended Bayesian Skyline Plot (EBSP) for lineages of S. ruficapilla and S. cinerea: a southern clade of S. ruficapilla; b same plot but zoomed on the last 0.20 mya; c northern clade of S. ruficapilla; d coastal clade of S. cinerea. The line in the middle shows the median estimate of the EBSP, and the upper and lower lines show the upper and lower 95% highest posterior density limits, respectively. The Y-axis is in ln scale. The X-axis represents time in millions of years

Isolation with migration

Our estimates of divergence time between clades of the S. ruficapilla complex obtained by IMa2 (Fig. S2 and Table 3) indicated that diversification took place during the late Pliocene to the Pleistocene. The oldest divergence, between S. cinerea and the ancestor of S. ruficapilla clades, occurred at ca. 0.902 mya (95% highest posterior density [HPD]: 0.541–3.461 mya). The split between the southern and northern clades of S. ruficapilla occurred at ca. 0.721 mya (95% HPD: 0.077–1.885 mya). Migration rate estimates were higher between the southern and the northern clades of S. ruficapilla. Migration rates were asymmetrical between the southern clade of S. ruficapilla and the coastal S. cinerea clade, being higher from S. cinerea into S. ruficapilla (Fig. S3 and Table 4). Our estimates of effective population size (Ne) evidenced that the southern clade had a higher Ne than the other clades. However, Ne likelihood curves of S. cinerea and S. ruficapilla overlapped (Fig. S2 and Table 3).

Table 3 Estimations of effective population size (number of individuals) and divergence time (in years) by IMa2
Table 4 Estimations of migration rates (2Nm: number of migrants per generation) estimated by IMa2. Parameters are according to Fig. 5

Hybrid zone

Haplotype networks and the Structure results based on the Z-linked genes recovered two lineages that corresponded to S. ruficapilla and S. cinerea (Figs. 2 and S4). The mtDNA subdivision of S. ruficapilla in southern and northern clades was not recovered by the Z-linked genes. These results confirmed the presence of a hybrid zone between the two species, which is also located near the Jequitinhonha River (Fig. 2 and Table S3). In addition, we found heterozygous males in localities 7–9 that exhibit alleles from both species in BRM15 and PLAA loci (three in PLAA from locality 7; four in BRM15 from localities 7–9). Structure results indicated a higher level of admixture in locality 7, with an abrupt decrease in introgression levels away from this population (Fig. 2d).

AICc indicated Model 1 (with fixed scaling for pmin and pmax and no tail parameters estimated) as the best-fit model for all loci. Clines for mtDNA and the Z-linked loci BRM15 and PLAA were coincident (i.e., same centre), with overlapping log-likelihood confidence intervals (Fig. 4 and Table S4): mtDNA c = 462.92 (397.74–551.90) km; BRM15 c = 529.25 (475.65–593.93) km; PLAA c = 457.54 (413.10–485.57) km. These results were corroborated by the estimated cline centre for the coefficient of ancestry q, c = 435.85 (370.93–486.06) km. Corroborating the Structure results, cline centres for these loci were around locality 7 (Figs. 2 and 4). Clines of these three loci were also concordant (i.e., overlapping widths). However, the Z-linked gene PLAA showed the narrowest cline: mtDNA w = 197.66 (95.04–415.21) km; BRM15 w = 179.37 (96.42–329.94) km; PLAA w = 28.32 (0.61–196.05) km. Cline width for the coefficient of ancestry q showed a similar pattern and overlapped with all loci: w = 78.29 (2.26–279.34) km. Confidence intervals for the cline centre of the CHDZ18 gene did not overlap with those from the other loci (Fig. S5), with a mean cline centre displaced southwards: c = 218.12 (21.43–345.82) km. This locus had a smoother allelic transition and a broader cline relative to the other loci: w = 403.20 (62.94–779.92) km. The expected cline width estimated under a neutral diffusion model was 339 km.
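
To make the meaning of the centre (c) and width (w) estimates concrete, the following Python sketch evaluates a simple two-parameter sigmoid cline and checks which centre intervals overlap. It assumes the common parameterization in which width is the inverse of the maximum slope, p(x) = [1 + tanh(2(x − c)/w)]/2, with pmin and pmax fixed at 0 and 1, and it assumes that the S. ruficapilla allele frequency declines with distance from the southernmost site; the cline-fitting software and the exact function used in the study are not restated here, so the function names and the `decreasing` flag are illustrative only. The numerical values are the estimates reported in the paragraph above.

```python
import math


def sigmoid_cline(distance_km, centre_km, width_km, decreasing=True):
    """Expected allele frequency under a sigmoid cline with p_min = 0, p_max = 1.

    Width is defined as the inverse of the maximum slope at the cline centre.
    """
    z = 2.0 * (distance_km - centre_km) / width_km
    if decreasing:
        z = -z
    return 0.5 * (1.0 + math.tanh(z))


def intervals_overlap(a, b):
    """True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Point estimates (centre, width) in km, as reported in the text.
estimates = {
    "mtDNA": (462.92, 197.66),
    "BRM15": (529.25, 179.37),
    "PLAA": (457.54, 28.32),
    "CHDZ18": (218.12, 403.20),
}

# Confidence intervals for cline centres in km, as reported in the text.
centre_ci = {
    "mtDNA": (397.74, 551.90),
    "BRM15": (475.65, 593.93),
    "PLAA": (413.10, 485.57),
    "CHDZ18": (21.43, 345.82),
}

# Loci whose centre interval overlaps that of mtDNA (i.e., coincident clines).
coincident_with_mtdna = [
    locus
    for locus in ("BRM15", "PLAA", "CHDZ18")
    if intervals_overlap(centre_ci[locus], centre_ci["mtDNA"])
]

# Expected frequencies every 100 km along a 0-1000 km transect for each locus.
profiles = {
    locus: [sigmoid_cline(x, c, w) for x in range(0, 1001, 100)]
    for locus, (c, w) in estimates.items()
}
```

Under this sketch, the interval comparison reproduces the pattern described in the text: the BRM15 and PLAA centres overlap that of mtDNA, whereas the CHDZ18 centre does not.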

Fig. 4 Maximum likelihood clines for the Z-linked loci and mtDNA across the hybrid zone of Atlantic Forest spinetails (Synallaxis ruficapilla and S. cinerea). Clines depict frequency of a given S. ruficapilla allele as a function of an accumulating distance from the southernmost site (locality 1, Fig. 1). Larger values of the coefficient of ancestry represent a higher proportion of genetic ancestry from S. ruficapilla. Distinct colors correspond to mtDNA (green), BRM15 (red), PLAA (purple), CHDZ18 (gray), and coefficient of ancestry (black). The maximum likelihood 95% credibility intervals of these clines are presented in Fig. S5

Paleodistribution of Atlantic Forest spinetails

Our ENMs exhibited good accuracy, as evidenced by true skill statistic values (TSS ≥ 0.9). ENMs for combined species produced results similar to those of models using only S. ruficapilla records, whereas models using only S. cinerea records generated maps with suitable areas in the southern AF, where this species does not occur, in all periods (Fig. 5). While current models for S. ruficapilla and combined species were similar to the known modern distribution of AF spinetails, the current model of S. cinerea overestimates its distribution (Fig. 5). Despite differences in ENMs for combined and individual species, all models were consistent with a scenario in which S. ruficapilla and S. cinerea were isolated during the LIG and resumed secondary contact during the LGM (Fig. 5).

Fig. 5 Ecological niche models (ENMs) of Atlantic Forest spinetails (records for both species and those from each species separately) for current time, mid Holocene (MH), last glacial maximum (LGM), and last interglacial (LIG). Warmer colors indicate a higher probability of species occurrence, as depicted in the legend. Dashed lines represent Doce (further south) and Jequitinhonha (further north) Rivers

Discussion

Diversification of Atlantic Forest spinetails

Our analyses based on mtDNA data revealed the presence of four lineages within the S. ruficapilla species complex (0.5–2.8% divergence; Fig. 1 and Table 1), with divergence times falling between the late Pliocene and the Pleistocene (Fig. S2 and Table 3). Z-linked genes corroborated the genetic differentiation between the two species, with a hybrid zone found along the Jequitinhonha River Valley. Despite morphological and vocal overlap between these species (Stopiglia et al. 2013), our analyses consistently support a clear genetic divergence between the two taxa described in an earlier analysis (Batalha-Filho et al. 2013). The split between the southern and northern clades in S. ruficapilla was not recovered based on autosomal (Fig. S1 and Table 1) and Z-linked genes (Fig. 2), probably due to recent population divergence. Also, the genetic divergence between the two species observed in mtDNA (Fig. 1) and Z-linked (Fig. 2) genes was not recovered in autosomal loci (Fig. S1). In theory, we expect weak or no divergence in autosomal genes due to incomplete lineage sorting for recent divergences (Palumbi et al. 2001; Zink and Barrowclough 2008). Additionally, differences in effective population sizes between mtDNA, sex-linked and autosomal loci could have led to a faster rate of fixation of mtDNA and Z-linked alleles due to drift, and differential introgression among these classes of loci between the two species could have contributed to genetic homogenization in autosomal loci (i.e., Haldane's rule; Haldane 1922; Borge et al. 2005; Macholán et al. 2007; Carling and Brumfield 2008, 2009).

The geographic distribution of lineages partially concurs with previous reports of range distributions and phenotypic diagnoses for S. ruficapilla and S. cinerea (Pacheco and Gonzaga 1995; Ribon et al. 2002; Vasconcelos and Silva 2004). However, the cryptic mtDNA lineage of S. ruficapilla from central AF (i.e., the northern clade, Fig. 1) does not coincide with any previously described phenotypic variation. This clade seems to be restricted to forests along the Espinhaço Mountain Range in the middle of the Jequitinhonha River Basin. On the other hand, while the Chapada Diamantina clade of S. cinerea exhibits low (0.5%) mitochondrial divergence from the coastal clade (Fig. 1), it seems to present distinct vocal variation (S.S. Santos et al. in prep). Additional analyses with a dense vocal sampling are needed to investigate this issue.

Our results were not consistent with the role of the Jequitinhonha River as a barrier responsible for the split between S. ruficapilla and S. cinerea (Vasconcelos and Silva 2004), since (i) we observed mtDNA haplotypes and Z-linked alleles of S. cinerea on both banks of this river, (ii) we observed Z-linked alleles of S. ruficapilla on both banks of the river, and (iii) paleodistribution models indicate range fragmentation and isolation due to historical climatic change events. While Amazonian rivers play an important role in avian population divergence (Ribas et al. 2012; Fernandes et al. 2012; Maldonado‐Coelho et al. 2013), AF rivers seem to be more permeable barriers to gene flow (Cabanne et al. 2007; Maldonado-Coelho 2012). In a scenario of a river as a primary barrier one would expect absence of population expansion in populations flanking the river (Cheviron et al. 2005) and a temporal congruence between population divergence and the origin of the current river course (Moritz et al. 2000). Thus, evidence from the geographic distribution of genetic variation and paleomodeling (Figs. 1, 3, and 5) was not consistent with rivers as drivers of differentiation of the S. ruficapilla complex, but was consistent with climatic oscillations as drivers of population divergence in this group. However, an alternative and yet untested explanation is that the hybrid zone between the two spinetails, as well as other contact zones in the AF, rests on population density troughs (Barton 1979; Barton and Turelli 2011). Hence, contact zones coinciding with river valleys in the AF could simply reflect a low population density region in which hybrid zones become “trapped”. Future demographic studies should address this issue.

The presence of a hybrid zone in the Doce-Jequitinhonha interfluve, as indicated by evolutionary relationships (Bayesian tree and haplotype network, Fig. 1), three independent population genetic analyses (IMa2, Structure and cline fitting, Figs. 2, 4, S3, and S5; Tables 4 and S3), the signature of late Pleistocene demographic expansion (Fig. 3) and ENMs (Fig. 5), seems to be consistent with a scenario of isolation and divergence during the late Pleistocene (last 0.1 mya). Indeed, our results of population size changes (between 0.1 and 0.02 mya) and ENMs were consistent with a scenario of fragmentation during the LIG and expansion in the LGM, which is in accordance with a recent hypothesis that postulates population growth during the latter period (Leite et al. 2016). According to this scenario, AF small mammal species expanded their range into the emerged continental shelf during the LGM, instead of the traditional view of population fragmentation during this period (Haffer 1969). Population growth during the LGM has also been documented for other AF passerines (Batalha-Filho et al. 2012; Cabanne et al. 2013).

The role of Quaternary climatic changes in shaping diversification in S. ruficapilla and S. cinerea might also be advocated to explain the evolutionary history of this group. Indeed, the refugia hypothesis has not been rejected by many studies of AF organisms (e.g., Cabanne et al. 2007, 2008; Carnaval et al. 2009; Martins et al. 2009; Ribeiro et al. 2010; d’Horta et al. 2011; Maldonado-Coelho 2012; Batalha-Filho and Miyaki 2016). Our data revealed Plio-Pleistocene splits and recent (last 0.1 mya) demographic expansion in all lineages (Fig. 3), which is in accordance with expectations of the refugia hypothesis (Moritz et al. 2000). However, climatic oscillations extend back into the Tertiary (Haffer 1997; Jansson and Dynesius 2002). Thus, we suggest that isolation and divergence due to forest fragmentation during the last ca. 2 mya is the most likely scenario explaining the evolution of S. ruficapilla and S. cinerea in the AF.

Hybrid zone dynamics

Clustering analysis based on Z-linked genes recovered two groups that correspond to the two spinetail species (Fig. 2). Also, the clustering and cline-fitting results indicate that the centre of the hybrid zone lies either near or at locality 7 (Fig. 2). Finally, these analyses suggest that introgression is somewhat geographically limited, despite hybrids being common at this locality.

There is an inverse relationship between cline width and the strength of selection on a given locus (Endler 1977; Barton 1983; Barton and Gale 1993), and cline width is therefore valuable information for inferring which traits are involved in barriers to gene flow. The PLAA, BRM15, and CHDZ18 genes have been shown to be under selection in hybrid zones of other passerine birds (Borge et al. 2005; Backström et al. 2010; Elgvin et al. 2011), and particularly the narrower cline of the PLAA gene (Fig. 4 and Table S4) concurs with the idea of sex chromosomes being involved in reproductive isolation in vertebrates (e.g., Macholán et al. 2007; Qvarnström and Bailey 2009; Irwin 2018). Indeed, as the PLAA gene had the narrowest cline, it is possible that this locus is under stronger selection. A previous study suggested that this gene is likely under diversifying selection and often has higher divergence rates than other Z-linked loci (Backström et al. 2010); thus, this gene might not function well in a hybrid background. Therefore, even though the log-likelihood confidence intervals of loci overlap, the narrower cline of the PLAA gene may suggest differential selection acting on different regions of the Z chromosome in the spinetail hybrid zone. In fact, distinct introgression patterns in different sex-linked genes seem to be a common feature in hybrid zones (Carling and Brumfield 2009; Macholán et al. 2011). On the other hand, CHDZ18 had a broader cline than the other two Z-linked genes, and the fact that it is not fixed at the end of the transect may be due to either retention of ancient polymorphism or introgressive hybridization under weaker selection on this gene.

Our results of cline slope and position suggest that the hybrid zone is likely maintained by a balance between selection and dispersal (Barton and Hewitt 1985; Gay et al. 2008). Although both endogenous and exogenous selection will result in a balance between selection and dispersal and consequently clines of similar shape (Barton and Gale 1993), we suggest that some form of endogenous selection underlies the width and position of this hybrid zone (i.e., the tension zone model), as there is no evidence of any apparent environmental gradient (pers. observ.). In this regard, post-zygotic selection likely maintains the width of the PLAA cline. Alternatively, the cline width observed for this locus could be the result of a recent admixture event (Endler 1977). However, the PLAA cline is more than 100 km narrower than a cline estimated under a neutral diffusion process (Table S4), thereby providing further evidence that some form of selection is hindering genetic homogenization.
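
The comparison with neutral expectations can be illustrated with a short Python sketch. It assumes the standard neutral diffusion approximation, in which cline width after secondary contact grows as w = √(2π)·σ·√t ≈ 2.51·σ·√t, where σ is the per-generation dispersal distance and t the number of generations since contact (Endler 1977; Barton and Gale 1993). The dispersal distance and time since contact used to obtain the 339 km estimate are not restated in this section, so the function below takes them as arguments and no specific values are implied; the function and variable names are hypothetical.

```python
import math


def neutral_cline_width(sigma_km, generations):
    """Expected cline width (km) under neutral diffusion after secondary contact.

    Assumes w = sqrt(2 * pi) * sigma * sqrt(t), with sigma the dispersal
    distance per generation (km) and t the generations since contact.
    """
    return math.sqrt(2.0 * math.pi) * sigma_km * math.sqrt(generations)


# Values reported in the text (km).
reported_neutral_width_km = 339.0
plaa_width_km = 28.32

# How much narrower the PLAA cline is than the neutral expectation.
width_gap_km = reported_neutral_width_km - plaa_width_km
```

The gap between the reported neutral width and the PLAA point estimate is what underlies the statement that this cline is more than 100 km narrower than expected without selection.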

Our small sample size prevents estimation of LD among loci. However, as clines of distinct loci were both coincident and concordant, it is possible that these loci are statistically associated. In our analyses, the simple sigmoid cline model (Model 1; only centre and slope fitted) was favored over stepped cline models (i.e., with centre, slope, and exponential tails fitted) for all loci. This may suggest two things. First, LD may be responsible for having brought these clines (except that of the CHDZ18 locus) into geographic coincidence and concordance. When alleles at different loci come into LD, clines tend to be spatially clustered and selection tends to act on them as a unit, augmenting selection against hybrids and reducing cline width (Bazykin 1969; Barton 1983; Szymura and Barton 1991; Nürnberger et al. 1995; Kruuk et al. 1999). Second, selection pressures on each individual locus may also have played a significant role, as loci influenced by this process are expected to have sigmoid rather than stepped clines (Barton 1983; Vines et al. 2016). Overall, these results suggest that (i) both the indirect effect of loci in LD and the effect of direct selection on individual loci may influence the fate of the spinetail hybrid zone and that (ii) the Jequitinhonha River is a permeable barrier to dispersal, as strong geographic barriers are expected to produce stepped clines (Barton and Hewitt 1989; Kawakami and Butlin 2012).

Geographic origin of the hybrid zone

The primary divergence and secondary contact scenarios were proposed long ago as alternatives to account for the origin of hybrid zones (Mayr 1942), but relatively few studies have contrasted them in a spatial–temporal framework. Most of the empirical evidence that hybrid zones arose through secondary contact comes from studies in the temperate zone (Barton and Hewitt 1985; Swenson 2008), but recent analyses also provide evidence supporting this model for the Neotropical region (Arias et al. 2012; Morales-Rozo et al. 2017). In our study, signatures of population expansion were found in both ENM (Fig. 5) and historical demographic analyses (Fig. 3 and Table 2) and suggest that the spinetail hybrid zone arose through secondary contact between populations that differentiated in allopatry in distinct refugia. Although the credibility interval associated with the divergence time is wide (Pleistocene to late Pliocene, Table 3), and previous secondary contact periods not modeled here are likely, our paleodistribution models set a minimum and recent age for the origin of the spinetail hybrid zone (i.e., during the LGM). This scenario is also consistent with evidence based on fossil pollen (e.g., Behling 1995, 1997, 2002; Behling and Negrelle 2001), phylogeographic studies (e.g., Cabanne et al. 2008; Maldonado-Coelho 2012; Batalha-Filho and Miyaki 2016), and other paleodistribution analyses (e.g., Carnaval et al. 2014; Leite et al. 2016), which indicate climatically induced fragmentation and range shifts of the AF itself and its associated organisms in the Quaternary.

Evolutionary forces maintaining the hybrid zone

Although it is not possible to distinguish between intrinsic and extrinsic mechanisms maintaining the narrow cline width at this stage, our results provide some insights that could be tested with further studies. Alternative hypotheses that account for narrow clines include the role of pre- or post-zygotic selection (i.e., due to sexual or ecological divergence), or a combination of both. For example, ENMs suggest that when the ecological niches of S. ruficapilla and S. cinerea are inferred separately, they remain geographically restricted relative to the model in which they are inferred together. This may suggest a different range of environmental conditions for each species, which in turn could indicate that ecological divergence is a source of selection causing lower hybrid fitness. However, it is also possible that this difference in ENMs is a consequence of some limitation in sampling sites. Future analyses will allow these possibilities to be contrasted.

Final remarks

Here, we provide a spatiotemporal diversification scenario of S. ruficapilla and S. cinerea spinetails in the AF, as well as a portrait of evolutionary processes shaping the hybrid zone between them. Divergence between species was recovered in mtDNA and Z-linked loci and suggests that isolation and differentiation in Quaternary refugia was a main cause of diversification. The narrow cline of one Z-linked locus relative to other loci suggests that some form of selection may shape this hybrid zone. Detailed studies on vocal, ecological, and genomic variation across this hybrid zone are warranted to understand further the speciation process of these taxa.