Introduction

Every year, 10–40% of the global crop harvest is lost to plant pathogens1,2, many of which are spread through infected seeds or propagules (e.g., stem cuttings). Informal seed systems account for the majority of crop germplasm planted by smallholder farmers3. Exchanging “seeds” (here understood as propagules sensu lato) allows farmers to acquire new varieties, recover lost types or compensate for seed shortage4. By maintaining large portfolios of crop varieties, farmers accommodate different cultural needs or preferences and buffer the effects of unpredictable climatic or epidemiological shocks5. However, seed exchange networks can also make smallholder farming systems more vulnerable if they facilitate the spread of seedborne plant diseases.

Few studies have explored in detail the role of seed exchanges in the endemic propagation of plant pathogens through landrace populations. Understanding how social networks of seed exchange influence the population dynamics of plant diseases is key to designing effective disease management programs, which increasingly rely upon community-based approaches to curb the spread of crop diseases6,7.

In smallholder farming communities, kinship systems play an important role in promoting seed exchanges between villages8,9,10,11,12. Kinship systems are cultural representations of relationships between individuals based on the notion of clan membership. By defining rules of descent and incest prohibitions, kinship systems structure matrimonial networks between communities and normalize social interactions between kin (related by descent) and affine (related by marriage).

In Gabon (Central Africa), marriages play an important role in regulating exchanges of cassava varieties between smallholder farmer communities11. Gabon is characterized by a strong cultural geographical contrast, with matrilineal societies occupying the southern side of the Ogooué River while patrilineal societies are predominant on the northern side (Fig. 1 and Table 1). In matrilineal societies, young women usually receive a gift of cassava cuttings from their mother when they marry (vertical transmission); the bride brings these cuttings to her husband’s village as part of her dowry, increasing the village’s varietal portfolio but also increasing the risk of importing infected germplasm into the community. In contrast, in patrilineal societies in northern Gabon, farmers rely mostly or sometimes exclusively on “heirloom” landraces that the bride receives from her mother-in-law when she moves in with her husband (affinal transmission); by discouraging seed exchanges between villages, affinal transmission keeps cassava genetic diversity within the boundaries of the community but can also act as a barrier against the introduction of seedborne pathogens.

Fig. 1: Distribution of villages surveyed in Gabon in 2006–2007 and 2014–2015 (insert) and origin of cassava varieties in patrilineal and matrilineal villages (pie charts).

ODJ was visited in 2004. MBG was visited in 2006 and 2015. Matrilineal and patrilineal villages are indicated by red and blue dots, respectively. The Ogooué River (dashed line) marks the demarcation between the patrilineal (north) and matrilineal (south) geographic domains, which corresponds also to the boundary between the A and B linguistic zones according to Guthrie’s classification of Bantu languages68. Vertical transmission (M: mother → daughter (red)) is predominant in matrilineal societies, while affinal transmission (HM: mother-in-law → daughter-in-law (blue)) is characteristic of patrilineal societies (see Table 1 and Supplementary Data 3 for details, including abbreviations).

Table 1 Seed exchange dynamics in the 11 communities surveyed.

Cassava diversity in Gabon exhibits a strong phylogeographic structure, with high varietal diversity in the south and low diversity in the north. This regional pattern of genetic diversity mirrors the geographic distribution of matrilineal and patrilineal societies, and the strong geographic divide between the two has contributed to maintaining it11. Here, we investigate whether this southern-matrilineal/northern-patrilineal contrast also influences the spatial structure of viruses responsible for the cassava mosaic disease (CMD), a major pandemic that threatens regional food security in Africa.

CMD is caused by a complex of viruses of the genus Begomovirus (family Geminiviridae), seven of which are endemic to Africa13. Cassava mosaic geminiviruses (CMGs) are naturally transmitted by whiteflies (Bemisia tabaci Gennadius [Aleyrodidae: Hemiptera]), which play a key role in the epidemiology of CMD14, and through infected stem cuttings, which contribute to maintaining high prevalence of CMD15. Since the early 1990s, a severe form of CMD that originated in East Africa from a synergistic interaction between the African cassava mosaic virus (ACMV) and Uganda strain of East African cassava mosaic virus (EACMV-UG) has been steadily expanding towards Central and West Africa16. Today, CMD is considered one of the most damaging plant diseases in the world15. Cultivars resistant to CMD have been widely deployed in an effort to contain the pandemic but limited knowledge about the role of local seed systems in the circulation of germplasm infected with CMD and the rate of adoption of disease-resistant cultivars has been a major barrier to their success17.

With DNA substitution rates in the order of 10⁻³ to 10⁻⁵ substitutions/site/year, CMGs evolve at molecular rates comparable to those of RNA viruses18. Because of this rapid evolutionary rate, it is possible to investigate factors that impact the dynamics of CMG transmission by analyzing the shape of viral phylogenies19,20.

ACMV is an excellent model for studying how seed exchange networks influence CMG diversity in cassava landrace populations. ACMV is omnipresent in sub-Saharan Africa where cassava is cultivated21. Unlike other CMG species, in which interspecific recombination and pseudo-recombination (the reassortment of heterologous genome components) is frequent22,23,24, the ACMV genome presents little evidence of recombination15 (although Tiendrébéogo et al.25 identified an ACMV-like recombinant in Burkina Faso), making it easier to infer movements of viral lineages between communities of farmers using phylodynamic methods.

Here, we extend the phylodynamic inference framework to include social factors that shape regional networks of seed exchange and demonstrate how social rules that control seed movements within and between farmer communities influence functional connectivity between local populations of ACMV in Gabon.

Results

Contrasting the population structure of ACMV and its host in Gabon

Viral DNA was recovered from dried cassava leaves collected between 2004 and 2015 across 11 villages chosen to represent contrasted situations in terms of social structure, ethnolinguistic diversity, accessibility, and degrees of insertion with local/regional markets (Supplementary Data 1). Diagnostic PCR revealed a high prevalence of CMGs, with an average of 80% plants infected by at least one CMG (Table 2). Out of 1132 plants tested, 52% were infected by ACMV only, 26% were co-infected by ACMV and EACMV, and <2% were infected by EACMV only. Multiple infections were most prevalent in Mbong-Ete, Nombedouma and Odjouma, where they represented >40% of samples tested positive for CMGs, and least prevalent in Mandilou and Cocobeach, where they represented <10%.

Table 2 Prevalence of CMGs in the 11 villages surveyed.

To analyze the phylogenetic structure of ACMV diversity, we built a maximum likelihood phylogenetic tree from a total of 392 viral sequences (346 unique haplotypes; Supplementary Fig. 1). While there was no evidence of a temporal signal (R² = 0.082; Supplementary Fig. 2), Bayesian clustering revealed a strong phylogeographic structure with viral haplotypes grouping into two main regional clusters: (i) a southwestern clade, which was prevalent in matrilineal villages (Nombedouma [NBD], Douani [DUA], Mandilou [MAN], Makoula [MKA]) and one patrilineal village (Cocobeach [CCB]); and (ii) a northeastern clade, which was predominantly found in patrilineal villages (Mbong-Ete [MBG], Misele [MIS], Minvoul [MVL], Imbong [IMB], Mopia [MOP]) and one matrilineal village (Odjouma [ODJ]). BAPS also identified a minor eastern clade (iii) restricted to villages alongside the eastern border with Congo (IMB, MOP, and ODJ) (Fig. 2a, b). Remarkably, the population structure of the virus was congruent with that of the host plant: Syrjala’s test26 detected no significant difference between the spatial distributions of viral and host clusters (Virus NE-Host NE: Ψ = 0.027, P = 0.421; Virus SW-Host SW: Ψ = 0.022, P = 0.069). BAPS identified four plant clusters: (i) a southern cluster, predominantly associated with matrilineal villages; (ii) a northern cluster, predominantly associated with patrilineal villages; (iii) an eastern cluster, predominant in IMB, MOP and ODJ; and (iv) a separate cluster mostly associated with CCB (Fig. 2c, d).

Fig. 2: Genetic structure in ACMV and cassava host populations in Gabon.

a Bayesian clustering analysis of n = 346 ACMV viral sequences using BAPS. Matrilineal and patrilineal villages are indicated by red and blue dots, respectively. Pie charts show the distribution of viral haplotypes among the three regional clusters identified by BAPS. No evidence for admixture between groups was found. b Neighbor-joining tree based on Nei’s genetic distance69 and 1000 bootstrap resampling. Bootstrap values >70% are shown as pie charts. c Bayesian clustering analysis of n = 423 cassava host plant multilocus genotypes based on nuclear microsatellites (Supplementary Methods). No genotypic data were available for MIS and MBG2015 (empty pie charts). d Neighbor-joining tree based on Rogers’ distance70 and 1000 bootstrap resampling.

Comparing levels of phylogenetic diversity in viral communities across regional clusters

To estimate the diversity of local viral assemblages, we used phylogenetic clustering methods to measure the dispersion of viral isolates across the phylogenetic tree and identify clusters of viral sequences that share a common evolutionary history (“phylotypes”27). Analysis of the viral phylogeny revealed 29 phylotypes and 20 “singleton” haplotypes that were not assigned to any cluster (Supplementary Fig. 1 and Supplementary Data 2). At the village level, phylogenetic diversity (as measured by the effective number of phylotypes, 1D) was strongly correlated with cassava varietal diversity (average number of landraces per farmer: Pearson’s correlation coefficient, r = 0.76, P = 0.005; total number of landraces: Pearson’s r = 0.63, P = 0.027) (Supplementary Fig. 3). Viral diversity was also strongly correlated with temperature seasonality (Pearson’s r = 0.62, P = 0.032) but not with precipitation seasonality (Pearson’s r = 0.54, P = 0.068).

In matrilineal villages, viral haplotypes showed low genetic relatedness and high dispersion across the phylogenetic tree, while in patrilineal villages viral populations formed cohesive clusters characterized by low taxonomic diversity (i.e., viral haplotypes were genetically closely related). The most diverse assemblage was observed in Nombedouma (species richness, 0D = 15; effective diversity based on Shannon index28, 1D = 10.80), where 33% of viral isolates did not relate to any phylotype (singletons), and the least diverse in Mbong-Ete (0D = 5; 1D = 2.12), where 80% of isolates clustered within a single phylotype (P20) (Table 3). Differences in mean levels of phylogenetic diversity were significant between matrilineal and patrilineal villages for 0D (species richness; Pallmann–Scherer test29, lower-tailed, P = 0.006) and 1D (Shannon diversity, P = 0.037), but not for 2D (dominance index, P = 0.135), which emphasizes abundant types (Fig. 3). This indicates that singletons and rare phylotypes contribute the most to differences in viral diversity between villages. A comparison of diversity profiles shows that increasing sampling effort would have likely resulted in an increase in the total viral diversity detected in matrilineal villages, whereas in many patrilineal communities (in particular villages from the northern cluster [MBG, MIS, MVL]) a plateau was already reached (Supplementary Fig. 4). In patrilineal villages, sample coverage averaged 93% (SD = 3.7%) compared to 80% (SD = 8.3%) in matrilineal communities (68% in Nombedouma; Table 3).

Table 3 ACMV phylogenetic diversity in the 11 communities surveyed.
Fig. 3: Differences in mean levels of phylogenetic diversity between (n = 5) matrilineal villages (M, red) and (n = 7) patrilineal villages (P, blue).

Villages from the eastern cluster with mixed kinship systems (IMB, MOP) were assimilated to patrilineal societies (details in Supplementary Data 1). The composition of viral assemblages is based on the distribution of viral haplotypes among the 29 phylotypes. a Diversity based on the number of phylotypes (0D). b Effective diversity based on Shannon entropy (1D). c Effective diversity based on Gini-Simpson index (2D). Box plots indicate the median (middle line), 25th and 75th percentile (box) and 5th and 95th percentile (whiskers) as well as outliers (open circles).

Temporal dynamics of viral diversity in patrilineal villages

Greater diversity in viral assemblages in matrilineal societies is consistent with an accumulation of genetically distinct ACMV variants resulting from repeated introductions of infected germplasm from different origins. In addition to the cassava varieties they received from their parents, farmers often also solicit cuttings from relatives, friends or neighbors. Such horizontal exchanges (farmer-to-peer) are comparatively less common in patrilineal societies where farmers, unless single, divorced or widowed, rely mostly on the varieties gifted to them by their mother-in-law (Supplementary Data 3).

By encouraging exchanges of cassava varieties with other communities, farmers in matrilineal villages also encourage gene flow between distinct ACMV subpopulations, resulting in local “hotspots” of viral diversity. Conversely, the stronger clustering of viral haplotypes in patrilineal villages is congruent with a limited inflow of new ACMV variants due to strong sociocultural barriers that discourage seed exchanges between communities. To study the dynamics of ACMV diversity in a patrilineal village, a temporal statistical parsimony network was built to depict genealogical relationships between viral sequences sampled in 2006 and 2015 in Mbong-Ete (Fig. 4). Most sequences from 2006 were derived from a single haplotype (MBG37), displaying a characteristic star-like pattern that suggests a rapid expansion from a single founder haplotype that spread through local populations of cassava landraces. The probable role of whiteflies in the local amplification of the infection is apparent from the lack of association between viral clades and the varietal identity of the host plants. MBG37 was sampled again in 2015 along with several closely related sequences but also many divergent haplotypes, the majority of which were sampled in cassava varieties recently introduced in the village (Fig. 4). Our data suggest a continuity of local infection dynamics centered on one main founder haplotype, with 50% of isolates from 2015 falling within the same phylotype (P20) as 80% of isolates from 2006 (Supplementary Data 2). A similar pattern was observed in Misele, ~5 km southwest of Mbong-Ete, where farmers grow a similar set of varieties. Many viral sequences in Misele were closely related to MBG37 (Supplementary Fig. 5), though this particular haplotype was not sampled in this village.

Fig. 4: Temporal statistical parsimony network showing the evolution of ACMV genetic diversity in MBG between 2006 and 2015.

Each circle represents a distinct viral haplotype, where circle size is proportional to haplotype frequency. Genetic divergence is expressed as the number of mutational steps (black dots) between haplotypes. Small empty circles represent haplotypes that were not sampled in the corresponding time layer. The varietal identity of host plants is indicated with different colors. In both time layers, networks are centered on a single shared haplotype (MBG37, circled in red).

In patrilineal villages, where affinal transmission prevails (mother-in-law → daughter-in-law), viral diversity evolves primarily through the build-up of mutations from local founders maintained in the population as the same landraces are passed down through generations, resulting in higher genetic relatedness among viral haplotypes. This was particularly well illustrated in Mbong-Ete, where 75% of the farmers interviewed had received their cassava cuttings from their mother-in-law. Mbong-Ete was revisited in 2015 and 20 farmers were interviewed, including 11 farmers whose farms were surveyed in 2006. While cultural attachment to heirloom landraces remains very strong, field surveys showed that varietal diversity almost quintupled at the village level in the nine-year interval (from 3 to 14 cassava varieties). However, half of the new varieties were private (grown by only one farmer), and on average farmers’ portfolios of cassava varieties increased only from three to four varieties (Table 1), with one variety (“Attends-Demain”, imported from Cameroon) becoming very popular among farmers.

Discussion

Host population structure is a major factor affecting virus metapopulation dynamics30. The striking similarities between the spatial distribution of viral clades of ACMV and cassava genetic clusters in Gabon suggest that the spread of the virus is constrained by factors that shape cassava diversity at the landscape level.

Many variables can influence functional connectivity between viral populations, including local abundance of whiteflies and accessibility of farmer communities (the ease with which a village can be reached). Dramatic increases in whitefly population density have been shown to drive the epidemic front of severe CMD pandemics across East and Central Africa31. Whiteflies have been reported in Cameroon, Equatorial Guinea, Republic of Congo, and Central African Republic. They were also observed in southeast Gabon in 2003–200432 and more recently in 2014–2015 in northern Gabon (M. Delêtre, pers. obs.), but, beyond anecdotal evidence, data on densities of whitefly populations are not available for Gabon. Seasonal variations of temperature and rainfall greatly influence cassava growth and have been associated with changes in the abundance of whiteflies and in the incidence of ACMV33. In Gabon, both temperature and precipitation seasonality show greater variability in the southern part of the country than in the north (Supplementary Fig. 7). There was also a positive correlation between viral diversity and temperature seasonality (BIO4). The possibility that environmental factors could also influence the spatial distribution of ACMV lineages cannot be ruled out. Temperature is a key determinant of whiteflies’ development and activity, and seasonal variability plays an important role in the dynamics of vector-borne transmission of cassava geminiviruses33,34. It seems unlikely, however, that environmental factors alone would explain differences in viral diversity between matrilineal and patrilineal societies, in particular the greater genetic relatedness within viral populations in the latter. Viral diversity was strongly and positively correlated with host diversity, while host diversity was not correlated with environmental factors.

At the village level, cutting-borne transmission plays a prevailing role in spreading the disease and several studies have shown that CMG infection is primarily sustained by the regular use of infected cuttings for cassava propagation14,31,35,36. Geographic isolation has important implications for farmers’ access to seeds and exposure to plant pathogens. With better transportation links, villages may be more exposed to movements of infected germplasm compared to more secluded communities that rely mostly on a smaller set of local varieties. However, local patterns of viral diversity were not concordant with either the size or geographic accessibility of villages. Nombedouma, where we recorded the highest viral diversity (Table 3), is a small community located on the shores of Lake Onangué and accessible only by boat from Lambaréné or Port-Gentil (Supplementary Data 1). Although anecdotal, the presence of EACMV-UG in the village as early as 2006 (Table 2) suggests that the UG strain might have been introduced with infected cassava cuttings imported from other villages, possibly favored by farmers’ open attitude to exchanging cuttings and continually importing new cassava varieties to test on their farms37. A diachronic comparison of the levels of cassava diversity between the 1960s and 2010s showed that varietal diversity in Nombedouma increased from 17 to >50 landraces in 40 years, with many varieties recorded in 1966 still being grown in 200637. In contrast, Mbong-Ete, where ACMV diversity was the lowest (Table 3), is located along the N2 road, a major axis for import/export with Cameroon and one of the most economically important roads in Gabon. Located within a triangle formed by Bitam in Gabon, Ambam in Cameroon, and Ebebeyin in Equatorial Guinea, the region is colloquially known as “Trois Frontières”, where cultural homogeneity has favored the development of a decentralized economic area with permeable borders to facilitate movements of people between the three countries and promote cross-border trade, notably through the creation of international markets in Abang Minko’o, Kyé-Ossi and “Mondial”38. Despite active trade in the region, cassava varietal diversity has been historically low in northern Gabon, with little renewal or increase over the past 100 years11,37, but the recent introduction of new cassava varieties indicates that social constraints can be relaxed to adapt to new threats such as emerging plant diseases.

When asked what prompted them to solicit cassava cuttings from outside their village, farmers replied it was the increasing incidence in their fields of a severe form of CMD, to which “Adzoro”, one of the three staple cassava varieties in the region, seems particularly susceptible. Mixed infection is an important feature of the severe form of CMD16,23. Between 2006 and 2015, the rate of mixed ACMV/EACMV infections in MBG increased from 25% to 43%, and EACMV-UG, which was not detected in 2006, was found in 7% of samples in 2015 (Table 2). EACMV-UG was first reported in Gabon in 200332, and while the virus was initially confined to the eastern part of the country, data suggest that the virus spread rapidly westwards. In 2006–2007, EACMV-UG was detected as far as Lambaréné, the westernmost record of the variant at the time, but was not found in Mbong-Ete (Table 2). EACMV-UG was still absent from areas bordering northern Gabon when the virus was first reported in Cameroon in 2010 near the border with the Central African Republic39. In 2015, however, EACMV-UG was detected in all four villages surveyed, with the highest prevalence (8%) in villages bordering Cameroon (MBG, MIS, MVL) and the lowest (2%) in Cocobeach (CCB), near the border with Equatorial Guinea where the virus had just been reported40.

While open seed exchange between matrilineal villages may have facilitated the spread of EACMV-UG across southern Gabon, prevalent endogamy (i.e., preferential marriage within the same cultural area) and the predominance of affinal transmission in patrilineal societies could have, in contrast, contributed to the delay of the arrival of EACMV-UG in northern Gabon by limiting exchanges with other communities. A nonmetric multidimensional scaling (NMDS) analysis of the phylogenetic composition of viral assemblages revealed that while local ACMV populations were generally distinct, irrespective of the geographic distance separating villages where they were collected, in northern Gabon viral populations almost overlapped (Supplementary Fig. 6). Only Cocobeach stood apart, mirroring patterns observed in the host plant.

Despite having slightly more diverse viral assemblages than Mbong-Ete and Misele, ACMV diversity was low in Cocobeach and Minvoul relative to the size and regional influence of the two communities. Whereas Mbong-Ete and Misele are small villages, Cocobeach and Minvoul are both urban clusters (>2000 and >4000 inhabitants, respectively) that play key administrative roles in controlling the flow of people and goods across the borders with Equatorial Guinea and Cameroon. Active cross-border trade could encourage the circulation of viral variants, yet ACMV diversity in both towns remained low compared to matrilineal villages, where even small communities showed higher levels of phylogenetic diversity. The lack of correlation between accessibility and viral diversity is further evidence that kinship systems and rules that control seed movements within and between farmer communities influence local landrace populations’ exposure to new ACMV variants. It highlights, in particular, the role of affinal transmission in limiting the inflow of plant pathogens in patrilineal villages.

The structure and dynamics of plant virus diversity are shaped by the modes of transmission of the disease, the host’s level of resistance or tolerance to the virus, and the ecology, population structure, and genetic diversity of the host plant30,41,42. Our results suggest that plant virus ecosystems also have a cultural component, and that social factors, in particular kinship systems and social networks of seed exchange, also influence the spatial structure of plant pathogens.

In this study we focused on the intraspecific diversity of viral communities as a proof of concept, using ACMV as a model. Although a similar approach may be difficult to apply to other CMGs, the ubiquity of ACMV in Africa makes it an interesting proxy for monitoring viral movements between communities of farmers. Ultimately, it could be used to anticipate the geographic spread of emerging diseases such as the cassava brown streak disease (CBSD), another devastating disease43 whose transmission over long distances is primarily borne by infected propagules44. Originally confined to East Africa, the CBSD pandemic has been expanding rapidly since 2004, causing repeated crop failures and severe food shortages across East Africa43. An expansion of the CBSD epidemic to Central and West Africa would have dramatic economic and socio-political consequences for Africa14,43. In Tanzania, efforts to control the CBSD outbreak through community phytosanitation, which is focused on replacing local landraces with virus-free clones in areas severely affected by the disease6,7, were partly compromised after farmers re-introduced heirloom varieties using infected propagules obtained through informal seed exchange networks6.

Community-based approaches focused on promoting virus-free clones from local landraces can be an effective strategy to offset the detrimental effects of the virus build-up in clonally propagated landrace populations45, but for disease management programs to benefit local development in the long term, we need to recognize the important social role played by seed exchanges in smallholder farming communities46,47. Formally testing the role of seed exchange networks in the epidemiology of CMD will require additional data collection and developing spatially explicit phylodynamic models to disentangle the effects of social factors from other environmental parameters. However, we believe that the notion of social epidemiology, which investigates the influence of social factors on the distribution of diseases in human populations, should be extended to seedborne pathogens in cultivated plants, whose transmission depends also on cultural factors that govern social interactions between farmer communities and determine connectivity between populations of the host plant.

Methods

Field surveys, sampling and molecular characterization of CMGs

Plant material collected in Gabon between 2004 and 2007 and already characterized for host plant genetic diversity11 was reanalyzed for the presence of cassava mosaic geminiviruses using diagnostic PCR. One village (MBG) was revisited and three additional villages were surveyed in 2014 and 2015 to study the evolution of cassava diversity and CMD prevalence in northern Gabon in the nine-year interval (Supplementary Methods).

In each community, plants were selected haphazardly with at least one sample per variety per farmer in order to maximize the number of farmers and landraces comprised in the sample, taking care that no farmer or landrace was over-represented (Supplementary Data 2). Representativeness of local datasets was assessed by computing Gini coefficients relative to farmers (GiF) and landraces (GiL) in R 3.5.148 using the package ineq49 (Supplementary Fig. 8). Gini coefficients measure inequality among the values of a frequency distribution and range between 0 (complete equality) and 1 (maximal inequality).
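The Gini coefficient used to check sampling balance can be illustrated with a minimal Python sketch. This is not the authors' code (they used the R package ineq); the function name `gini` and the sample counts are hypothetical, and the formula is the standard rank-weighted form without a finite-sample correction, which is an assumption about the exact variant used.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (no finite-sample correction).

    0 means every farmer (or landrace) contributed the same number of
    samples; larger values mean a few contributed most of them.
    """
    x = sorted(counts)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive count")
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(x, start=1))
    return (2 * weighted / total - (n + 1)) / n


# Hypothetical numbers of samples taken from four farmers in one village.
balanced = [3, 3, 3, 3]
skewed = [0, 0, 0, 12]
print(gini(balanced))  # 0.0
print(gini(skewed))    # 0.75
```

Without a finite-sample correction, the maximum for n values is (n − 1)/n, so the upper bound of 1 is only approached as the number of farmers or landraces grows.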

Total DNA was extracted from 20 mg of dried leaves using DNeasy® Plant Mini kits (Qiagen®). In each village, ~100 plants (1132 in total) were screened for the presence of ACMV, EACMV and East African cassava mosaic Cameroon virus (EACMCV) in single and mixed infection using a multiplex-PCR assay50. Samples for which PCRs revealed single infection by ACMV were selected for sequencing. Samples with mixed infection were additionally screened for the presence of EACMV-UG using specific primers22.

To characterize ACMV diversity, a set of degenerate primers was used that amplifies ~528 bp of the replication-associated protein (Rep) open reading frame (ORF) AC136. Detailed protocols for all PCR assays are provided in Supplementary Data 4. Amplification was checked on 1% TAE agarose gel stained with SybrSafe (Invitrogen). Positive PCRs were sent for direct sequencing in both directions without cloning (Macrogen Inc., Korea). Sequences were aligned and edited using CodonCode Aligner 7.0 (Codoncode Corporation, Dedham, MA, USA). After trimming short and low-quality sequences, a total of 404 sequences (484 bp) were BLASTed against the GenBank database to confirm sequence identity. All but 12 sequences showed 95.7–99.6% nucleotide identity with ACMV, while the other 12 sequences showed 97.7–99.8% similarity with EACMV viruses (Supplementary Data 2). Sequences were aligned using MUSCLE 3.8.3151 and alignments were refined manually. Tests performed with RDP452 showed no evidence of recombination.

Statistical analyses

Phylogenetic and clustering analysis

FastTree 2.1.953 was used to build a maximum-likelihood (ML) tree from the 2006–2007 and 2014–2015 datasets combined under the GTR + CAT model. Branch support was estimated using nonparametric approximate likelihood-ratio tests (aLRT SH-like)54. TempEst 1.555 was used to test for the presence of a temporal signal in the molecular phylogeny by performing regression analyses of genetic divergence between viral sequences against sampling dates.
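The temporal-signal check amounts to a linear regression of root-to-tip divergence on sampling date, with R² as the summary statistic. The Python sketch below shows only that regression step, assuming root-to-tip distances have already been extracted from the tree. It does not reproduce TempEst's root optimization, and the function name and input values are hypothetical.

```python
def root_to_tip_regression(dates, divergences):
    """Least-squares regression of root-to-tip divergence on sampling date.

    Returns (slope, r_squared). The slope is a crude substitution-rate
    estimate; an R² near 0 indicates no detectable temporal signal.
    """
    n = len(dates)
    if n != len(divergences) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in divergences)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and divergences must both vary")
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Hypothetical, perfectly clock-like data (0.001 substitutions/site/year).
slope, r2 = root_to_tip_regression([2004, 2006, 2015], [0.010, 0.012, 0.021])
print(slope, r2)
```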

To evaluate the diversity of viral communities, Cluster Picker 1.356 was used to measure the dispersion of haplotypes across the tree and identify clusters of viral sequences that share a common evolutionary history (“phylotypes”27, subsequently treated as operational taxonomic units [OTUs] in diversity analyses). Clusters were assigned with branch support (aLRT) >70% and maximum pairwise genetic distance between taxa ≤4.5% to minimize the number of singletons, i.e., DNA sequences not assigned to any cluster. Viral diversity was then evaluated using three measures of Hill numbers qD with q = 0 (species richness), q = 1 (exponential of Shannon index) and q = 2 (inverse Simpson index). Sample-size- and coverage-based rarefaction and extrapolation curves57 were generated using the R package iNEXT58. Extrapolated data was calculated up to a base sample double the size of the smallest reference sample and 95% confidence intervals were derived from 100 bootstrap replicates.
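For readers unfamiliar with Hill numbers, the following Python sketch computes the three orders used here from a vector of phylotype counts. It gives only the plug-in (observed) values, not the rarefied or extrapolated estimates produced by iNEXT, and the function name and example counts are hypothetical.

```python
import math


def hill_number(counts, q):
    """Hill number of order q from raw abundance counts.

    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    and q = 2 the inverse Simpson concentration.
    """
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    if total == 0:
        raise ValueError("need at least one positive count")
    p = [c / total for c in positive]
    if q == 1:
        # Limit of the general formula as q approaches 1.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))


# Hypothetical village: isolates counted per phylotype.
even = [5, 5, 5, 5]      # four equally common phylotypes
dominated = [8, 1, 1]    # one dominant phylotype, two rare ones
for counts in (even, dominated):
    print([round(hill_number(counts, q), 2) for q in (0, 1, 2)])
```

When all phylotypes are equally common the three orders coincide with the number of phylotypes. When one phylotype dominates, the values fall as q increases, because higher orders give more weight to abundant types and less to rare ones.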

To compare diversity measures between villages clustered by kinship, the mcpHill function implemented in the R package simboot59 was used to perform Tukey-like contrast tests based on resampling (1000 iterations). Unlike ANOVA, which assumes normality and homoscedasticity, this method does not make any distributional assumptions but accounts for correlations among variables and the distributional characteristics of the data. Following Pallmann et al.29, p-values were adjusted for multiple comparisons across groups and across diversity measures for integral Hill numbers of orders 0 ≤ q ≤ 2. To compare the composition of viral assemblages across regional clusters, the R package vegan60 was used to perform a nonmetric multidimensional scaling analysis (NMDS) using Bray–Curtis dissimilarities.
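The Bray–Curtis dissimilarity that feeds the NMDS can be sketched in a few lines of Python. The example computes only the pairwise dissimilarity matrix from phylotype counts and leaves out the ordination step performed in vegan. The village labels and counts are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    0 means identical composition; 1 means no phylotype is shared.
    """
    if len(a) != len(b):
        raise ValueError("vectors must list the same phylotypes")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("at least one vector must be non-empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical isolate counts for three phylotypes in three villages.
villages = {
    "V1": [6, 4, 0],
    "V2": [3, 4, 3],
    "V3": [0, 0, 10],
}
names = list(villages)
matrix = [[bray_curtis(villages[i], villages[j]) for j in names] for i in names]
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

Villages whose assemblages overlap strongly have dissimilarities near 0 and would plot close together in an ordination, while villages sharing no phylotypes sit at the maximum of 1.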

At the village level, statistical parsimony networks were constructed using the R script TempNet61 to analyze topological properties of local networks of viral haplotypes. Although less accurate than character-based approaches, distance-based methods are useful to study intraspecific genetic variation in small viral populations, in which ancestral haplotypes often coexist along with their descendants, resulting in polytomies which violate assumptions of phylogenetic reconstruction methods62. Networks based on genetic similarity between viral isolates are an intuitive approach to derive information on the local dynamics of seedborne pathogens. Relationships among viral haplotypes can be represented as undirected networks in which nodes (haplotypes) are connected by edges whose length corresponds to the shortest genetic distance between nodes.

The spatially explicit Bayesian clustering method for DNA sequence data implemented in BAPS 6.063,64 was used to explore the population genetic structure of ACMV in Gabon. For the genetic mixture analysis, five independent runs were performed using the “spatial clustering of individuals” option with an upper limit of 15 for the maximum number of clusters (K). Population admixture analysis was performed using the “admixture based on mixture clustering” module, with a minimum population size of two, 100 iterations to calculate the admixture coefficient for individuals and 200 reference individuals from each population. Twenty iterations were used to calculate the admixture coefficient for the reference individuals. For comparison, BAPS was also used to explore patterns of genetic diversity in cassava host plants using multilocus genotypic data from six nuclear microsatellite markers (Supplementary Methods). Syrjala’s nonparametric distributional test26 was used to test for spatial congruence between host and virus spatial clusters, as implemented in the R package ecespa65.

To test whether ACMV population structure is influenced by environmental factors, elevation data and climatic rasters for temperature seasonality [BIO4 (standard deviation × 100)], total annual rainfall [BIO12] and precipitation seasonality [BIO15 (coefficient of variation)] at 30 s resolution (~1 km²) were obtained from WorldClim v2.166. Vegetation cover data was obtained from Mayaux et al.67. The effect of environmental parameters on the prevalence and diversity of ACMV was tested using Pearson’s correlation coefficient r.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.