Extensive recombination challenges the utility of Sugarcane mosaic virus phylogeny and strain typing

Braidwood, Luke; Müller, Sebastian Y.; Baulcombe, David

doi:10.1038/s41598-019-56227-y

Published: 27 December 2019

Scientific Reports volume 9, Article number: 20067 (2019)

Abstract

Sugarcane mosaic virus (SCMV) is distributed worldwide and infects three major crops: sugarcane, maize, and sorghum. The impact of SCMV is increased by its interaction with Maize chlorotic mottle virus which causes the synergistic maize disease maize lethal necrosis. Here, we characterised maize lethal necrosis-infected maize from multiple sites in East Africa, and found that SCMV was present in all thirty samples. This distribution pattern indicates that SCMV is a major partner virus in the East African maize lethal necrosis outbreak. Consistent with previous studies, our SCMV isolates were highly variable with several statistically supported recombination hot- and cold-spots across the SCMV genome. The recombination events generate conflicting phylogenetic signals from different fragments of the SCMV genome, so it is not appropriate to group SCMV genomes by simple similarity.

Introduction

Sugarcane mosaic virus (SCMV) is a positive-sense single-stranded RNA virus in the Potyviridae family (genus Potyvirus), the largest and most economically damaging family of plant viruses¹. SCMV can infect three major crops: sorghum, sugarcane (10–35% yield loss), and maize (20–50% yield loss), and is thought to be one of the top-ten most economically damaging plant viruses^2,3,4. It has been reported in 84 countries across the 6 inhabited continents, and this cosmopolitan distribution is likely due to worldwide trade in its host crops for hundreds of years (Fig. 1)⁵. Most potyviruses, including SCMV, are spread non-persistently by aphid species⁶, but SCMV can also spread via movement of infected root cane (sugarcane) and through maize seeds and pollen^7,8.

The potyvirus genus is notable for its size (>150 species) and extensive involvement in synergistic plant viral conditions. Typically, potyviruses enhance the titre of the partner virus in synergistic interactions through a process that is dependent on the multifunctional helper-component protease (HC-Pro)^9,10. Synergism between potyviruses, including SCMV, and Maize chlorotic mottle virus (MCMV) causes maize lethal necrosis (MLN) that can cause total yield loss¹¹. SCMV threatens both food security and economic development because maize and sorghum are vital staple foods, while sugarcane is an important cash crop. Despite SCMV being present in East Africa and China for decades, its impact in both regions has been enhanced by the recent arrival of MCMV, and therefore MLN^{12,13,14,15,16}. Increased understanding of the variability and evolution of SCMV in these regions may inform future disease control measures.

SCMV has a typical potyvirus genome: a roughly 9.5 kb monopartite positive-sense single-stranded RNA molecule (Fig. 1b) which is packaged into around 2,000 helically arranged coat protein (CP) monomers to form flexuous virions 750 nm long and 13 nm wide. The 5′ end of the genome is capped by the 25 kDa VPg protein, and the 3′ end is poly-adenylated. Translation of the genome produces a single polyprotein which is cleaved by three viral-encoded proteases to generate ten multifunctional proteins^1,17. An additional protein, P3N-PIPO, is generated due to transcriptional slippage in the P3 gene at a conserved GAAAAAA motif during genome replication^18,19.

Potyviridae evolution features extensive intra-specific recombination^20,21, which likely occurs when the viral RNA-dependent RNA polymerase (RdRP) switches between viral genome templates²² during virus replication. Reported recombination hot-spots are in the P1 region of Turnip mosaic virus and in the CI-NIa-protease region of several species (Fig. 1)^{23,24,25,26,27,28,29,30,31,32,33,34}. Predicted recombination breakpoints in the SCMV genome are in CI, NIb, NIa-VPg, and NIa-Pro, and the 6K1-VPg-NIa-Pro-NIb region has been called a recombination hot-spot, although without statistical support^{28,29,30,31,32,33,34}. Recombination complicates phylogenetic analyses because various genome regions in a single individual may have different evolutionary histories. Accordingly, constructing phylogenies using different sections of SCMV and other potyviral genomes produces conflicting trees^35,36. Recombination may also impede virus detection because increased genomic variation may lead to false negative results with common techniques such as PCR and antibody ELISA^11,37.

There are multiple potyviruses present in East Africa which could act as partner viruses to MCMV. Therefore, we decided to survey MLN-infected maize in Kenya and Ethiopia using next-generation sequencing (NGS) to allow identification and analysis of the partner viruses in this region. The only partner virus we detected was SCMV, and these data were then used to look for signals of historical recombination in the SCMV genome. We also assessed the suitability of traditional phylogenetic methods for SCMV genomic data.

Results and Discussion

Sequencing of MLN-infected maize reveals SCMV

In August 2014 we collected 23 MLN-symptomatic (mosaic and chlorosis on leaves) maize samples from 13 Kenyan and 4 Ethiopian sites (Table S1) and performed NGS RNA-seq (ArrayExpress accession: E-MTAB-7002). All samples contained both MCMV, characterised previously³⁸, and SCMV (Fig. S1). The 23 assembled SCMV sequences ranged from 2,191 bp to 9,632 bp, which is 23% to 100% of the longest previously reported SCMV sequence (available in GenBank: MH093717-MH093739). We aligned and manually trimmed the long (>8000 bp) SCMV contigs to the full SCMV genomes available in GenBank for further analysis.

There were small insertions in the 5′ and 3′ untranslated regions (UTRs) in one (JX047421.1) and three (JX185303.1, EU091075.1, GU474635.1) isolates respectively, but most insertion/deletion variation was over a 200 bp region of the CP coding sequence (Figs. 2a and S2, S3). CP indels were not distributed according to the geographic location of isolates. A 15 bp insertion, for example, was present in isolates from Ethiopia (1), Kenya (9), Rwanda (2), USA (1), and Mexico (1), whereas a 3 bp insertion at the same locus was present in isolates from Kenya (2), China (10), and Ecuador (1).

We assessed the nucleotide polymorphism diversity of multiple sequence alignments using DnaSP v5³⁹. Diversity was high, with 4,289 mutations spread over 2,831 sites and an average of 1,121.1 nucleotide differences between sequences. Nucleotide diversity across the genome was 0.17, higher than for most RNA viruses but within the range previously reported for SCMV^30,33. There were high polymorphism regions in the N-termini of P1 and CP, and the most conserved regions were in the central domain of P3 and the 3′ UTR (Fig. 2b,c). P1 is a serine protease with a known hyper-variable region⁴⁰. The variable region in the CP N-terminus (Figs. 2b,c and S2, S3) corresponds to a domain with variable amino acid length and low conservation^41,42, with episodic positive selection detected by MEME analysis (Fig. S4)^43,44. This N-terminal domain is surface located, raising the possibility that variation in this region may alter interactions with host or vector proteins^45,46.
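The diversity statistics above can be illustrated on a toy alignment. The sketch below computes the mean number of pairwise nucleotide differences and divides it by the number of sites compared to give nucleotide diversity (π). It is not DnaSP: the function name, the toy sequences, and the choice to drop every column containing a gap or ambiguous base are assumptions made for illustration.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Return (pi, mean_pairwise_differences, n_sites) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    valid = set("ACGT")
    # Assumption: keep only columns where every sequence has an unambiguous base.
    columns = [col for col in zip(*(s.upper() for s in seqs)) if set(col) <= valid]
    n_sites = len(columns)
    if n_sites == 0:
        raise ValueError("no complete columns to compare")
    pairs = list(combinations(range(len(seqs)), 2))
    # Total differences summed over all sequence pairs and all retained columns.
    total = sum(col[i] != col[j] for col in columns for i, j in pairs)
    mean_diff = total / len(pairs)
    return mean_diff / n_sites, mean_diff, n_sites

# Hypothetical toy alignment (not SCMV data); the gapped column is excluded.
toy = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGT-C"]
pi, mean_diff, n_sites = nucleotide_diversity(toy)
print(f"sites={n_sites} mean differences={mean_diff:.3f} pi={pi:.3f}")
```

Because π is the mean pairwise difference per site, the value depends on which columns are retained, so gap handling should be reported alongside the statistic.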

The conserved P3 protein is essential for potyviral replication, but it is also the locus of the cryptic fusion protein P3N-PIPO, and this overlapping open reading frame is an extra constraint on the evolution of the nucleotide sequence⁴⁷. Interestingly, in the P3N-PIPO region, there were also sites with episodic positive selection detected by MEME (Fig. S4). The 3′ UTR of potyviruses contains a poly-A tail, which promotes genome stability and translation and is completely conserved.

Evidence for SCMV recombination

The alignment patterns of several samples suggested recombination, with different regions of the same sample genome showing closest alignment to divergent reference genomes (Fig. 3a). To simplify the analysis whilst retaining maximum diversity, we subsampled the alignment of 116 sequences. We generated a nucleotide identity matrix (Table S2) and grouped sequences with >99% similarity, then kept the longest sequence in each group. This produced a final dataset of 55 SCMV genomes/contigs, including 13 from our NGS libraries.
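The subsampling step described above (group sequences sharing more than 99% identity, keep the longest member of each group) can be sketched as follows. The text does not say how groups were formed, so the single-linkage grouping used here is an assumption, as are the function names and the toy sequences.

```python
def ungapped_length(seq):
    """Number of non-gap characters in an aligned sequence."""
    return sum(ch != "-" for ch in seq)

def identity(a, b):
    """Fraction of identical bases over columns where both sequences have a base."""
    shared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)

def subsample(aligned, threshold=0.99):
    """Keep the longest sequence from each group linked by identity > threshold."""
    names = list(aligned)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path compression.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Assumption: single linkage, i.e. any pair above the threshold joins two groups.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if identity(aligned[a], aligned[b]) > threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(max(g, key=lambda n: ungapped_length(aligned[n]))
                  for g in groups.values())

# Hypothetical toy alignment: iso_b is a shorter near-copy of iso_a.
base = "ACGT" * 50
toy = {
    "iso_a": base,
    "iso_b": "T" + base[1:150] + "-" * 50,
    "iso_c": "TGCA" * 50,
}
print(subsample(toy))
```

Single linkage can chain together sequences that are each below the threshold relative to one another, so a stricter grouping rule may give a different representative set.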

Splits network analysis can detect and visualize conflicting signals from a phylogenetic dataset⁴⁸. Conflicting signals imply that the relationship between sequences is different depending on the part of the sequence being analysed, and they can be caused by recombination or horizontal gene transfer. Splits networks with reticulate shapes rather than bifurcating tree shapes indicate conflicting phylogenetic signals. Here, we found the splits network derived from our SCMV sequences to be very different from a bifurcating tree indicating conflicting phylogenetic signals and implying recombination (Fig. 3b). Additional independent evidence for SCMV recombination comes from the distribution of multiple indels that do not correlate with geographic or phylogenetic proximity (Figs. 2a and S2, S3).

To estimate the number and location of recombination breakpoints in SCMV, we used Recombination Detection Programme 4 (RDP4)⁴⁹. RDP uses multiple algorithms to locate sites in an alignment at which phylogenetic signals change rapidly, which is indicative of a recombination event. The recombination scheme suggested multiple recombination events, with many between geographic regions (Fig. 3c,d). There was notable reciprocal exchange of recombinant fragments between European and Chinese isolates (Fig. 3c,d), and between Chinese and African isolates. There was also evidence for intra-region recombination in the regions with more than five isolates (China and Africa). As expected, recombination was more frequent between strains within a region than between strains in different regions. Additionally, there were 28 recombination events with unknown parents (i.e. genome fragments not closely related to any known isolates), demonstrating that more sequencing data will be required to fully describe worldwide recombination patterns.

To search for regions of the genome with an over- or under-representation of recombination, we counted the recombination breakpoints in sliding windows across the SCMV genome (Fig. 4a), calculated likelihood ratios, and used permutation testing to identify statistically significant regions (Fig. 4b). The permutation test randomly places recombinant fragments spanning the same number of variable nucleotide positions as each detected recombinant fragment, which controls for sequence variability and generates a density map of where recombination is more likely to be detected in the SCMV alignment.

‘Global hot/cold-spots’ were defined as windows with more (hot) or fewer (cold) breakpoints than 95% of the sliding windows across the genome, and ‘local hot/cold-spots’ were those with more or fewer breakpoints than 99% of the permuted sliding windows at that position. Global recombination hot-spots were present at the 5′ and 3′ genomic termini. We detected nine local recombination hot-spots and twelve local recombination cold-spots (Fig. 4). The hot-spots were concentrated in the 3′ region encoding NIb and CP, with single hot-spots in CI, P3N-PIPO, and the P1/HC-Pro junction (Fig. 4c). The cold-spots were distributed more uniformly across the first 7,500 bp of the genome. These are the first statistically supported recombination hot- and cold-spots reported in SCMV.
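The local hot- and cold-spot test can be sketched in a few lines. The code below counts breakpoints in sliding windows, then repeatedly re-places each recombinant fragment at random so that it spans the same number of variable sites, and flags windows whose observed count is more extreme than 99% of the permuted counts at that position. It is a simplified illustration, not the procedure used to produce Fig. 4: the likelihood-ratio step and the global test are omitted, and the window size, step, permutation count, function names, and toy data are all assumptions.

```python
import random

def window_counts(breakpoints, genome_len, window, step):
    """Count breakpoints falling in each sliding window [start, start + window)."""
    starts = range(0, genome_len - window + 1, step)
    return [sum(s <= b < s + window for b in breakpoints) for s in starts]

def flatten(events):
    """Turn (start, end) fragments into a flat list of breakpoint positions."""
    return [pos for event in events for pos in event]

def permute_events(events, variable_sites, rng):
    """Re-place each fragment at random, keeping the number of variable sites it spans."""
    n = len(variable_sites)
    placed = []
    for start, end in events:
        span = max(1, sum(start <= v <= end for v in variable_sites))
        i = rng.randrange(n - span + 1)
        placed.append((variable_sites[i], variable_sites[i + span - 1]))
    return placed

def local_spots(events, variable_sites, genome_len, window=200, step=50,
                n_perm=500, level=0.99, seed=0):
    """Return observed window counts plus indices of local hot and cold windows."""
    rng = random.Random(seed)
    observed = window_counts(flatten(events), genome_len, window, step)
    at_least = [0] * len(observed)  # permutations with count >= observed
    at_most = [0] * len(observed)   # permutations with count <= observed
    for _ in range(n_perm):
        shuffled = permute_events(events, variable_sites, rng)
        permuted = window_counts(flatten(shuffled), genome_len, window, step)
        for w, (obs, perm) in enumerate(zip(observed, permuted)):
            at_least[w] += perm >= obs
            at_most[w] += perm <= obs
    cutoff = (1 - level) * n_perm
    hot = [w for w, c in enumerate(at_least) if c <= cutoff]
    cold = [w for w, c in enumerate(at_most) if c <= cutoff]
    return observed, hot, cold

# Hypothetical toy data: 12 short fragments clustered around position 500.
variable_sites = list(range(0, 1000, 5))
events = [(500 + i, 520 + i) for i in range(12)]
observed, hot, cold = local_spots(events, variable_sites, genome_len=1000)
print("observed:", observed)
print("hot windows:", hot, "cold windows:", cold)
```

Placing fragments by variable-site count rather than by nucleotide position is what lets the null distribution reflect where recombination is detectable, since breakpoints cannot be resolved in invariant stretches.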

Recombination can promote nucleotide diversity by mixing lineages, and we note that the 3′ genomic region encoding CP in SCMV has high nucleotide diversity (Fig. 2), and a concentration of recombination hot-spots (Fig. 4). Potyvirus recombination hot-spots have previously been observed in the C-terminal region of CI, which we also observed, and in P1, which we did not^{23,24,25,26,27}. Recombination is clearly a major force in SCMV evolution (Fig. 3), as in the Potyviridae generally^20,50.

Is making an SCMV phylogeny useful?

The purpose of a phylogeny is to describe the evolutionary history of biological entities. This exercise has academic value, in tracing the history of life, and practical value, in organising similar biological entities into clades. If a phylogeny does not describe evolutionary history, and does not group biological entities into self-similar clades, it is an inappropriate analysis (due to the methodology or the underlying data) and does not contain useful information. Given the high levels of recombination between our SCMV sequences, we decided to investigate whether further phylogenetic analysis is appropriate.

There are many published SCMV phylogenies, based on CP sequences and whole genomes, which place isolates into two to six strains with variable names; see Gao et al. (2011) for a helpful summary⁵¹. Whole genome phylogenies of SCMV group isolates into four strains (IA, IB, II, III, IV), with around 80% nucleotide similarity between strains^33,51. African SCMV genomes sequenced in this study form two novel clusters (AI and AII) of sequence identity, decreasing the separation between previously reported strains (Fig. 3b).

Simulations show that phylogenetic analyses are most severely impacted by recombination when breakpoints occur near the centre of alignments, and by recent recombination between diverged taxa⁵². Our recombination analysis shows evidence of recombination between divergent (<80% nucleotide identity) SCMV isolates, in the centre of both genomes and CP sequences (Figs. 3 and S5, Table S3). Therefore, there is no single evolutionary history for phylogenetic analyses to infer. To statistically test for conflicting phylogenetic signals, we constructed phylogenies by maximum likelihood using the whole SCMV genome, and three sections of the alignment (section 1: positions 963–2,764 in the original alignment, section 2: 2,875–5,103, and section 3: 5,181–8,036) without recombination hot-spots (Fig. 4, Supplemental Data d1). We chose alignment sections with a minimum number of recombination events (i.e. containing cold-spots) which were separated by recombination hot-spots (Fig. 4c). Tree incongruence was tested statistically using a Shimodaira-Hasegawa test (SH-test) for each pair of trees. The log likelihood differences were 12,222 between sections 1 and 2, 12,739 between sections 1 and 3, and 15,692 between sections 2 and 3 (p < 1e-7 and n = 2 trees for all comparisons), confirming significant differences between the trees, which can be visualised using tanglegrams or identity matrices (Fig. S6).
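Extracting the three alignment sections is a matter of slicing columns. The sketch below assumes the reported positions are 1-based and inclusive, which is the usual convention for alignment coordinates but is not stated in the text; the dictionary layout, function name, and placeholder alignment are hypothetical.

```python
# Section boundaries as reported, assumed to be 1-based inclusive alignment columns.
SECTIONS = {
    "section1": (963, 2764),
    "section2": (2875, 5103),
    "section3": (5181, 8036),
}

def slice_alignment(aligned, start, end):
    """Return columns start..end (1-based, inclusive) of every aligned sequence."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return {name: seq[start - 1:end] for name, seq in aligned.items()}

# Placeholder alignment of the wrong content but a plausible length.
toy = {"iso_a": "A" * 9000, "iso_b": "C" * 9000}
for label, (start, end) in SECTIONS.items():
    section = slice_alignment(toy, start, end)
    print(label, len(section["iso_a"]), "columns")
```

Under that convention the sections are 1,802, 2,229, and 2,856 columns long, so each is long enough to support its own maximum likelihood tree.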

Conclusion

Viral studies often present a phylogeny followed by evidence of extensive recombination, showing that the central assumption of the phylogenetic analysis was violated^23,30,31,33. Multiple evolutionary histories within a genome are valid and averaging these different histories does not produce the true evolutionary history of the genome⁵². Imposing a bifurcating tree structure on a dataset which does not have a single, bifurcating evolutionary history will introduce systematic error. We argue that in organisms with unknown or high recombination rates, such as RNA viruses, recombination analyses should be performed initially, then used to inform the phylogenetic approach taken, as in Ohshima et al.²⁵. Splits network analysis is appropriate for all alignments, but standard phylogenetic methods may not be, depending on the splits network results. Phylogenetic analyses of whole genomes may be desirable to identify viral strains containing isolates with a broadly similar evolutionary history. However, the presence of sequences from different strains which have entered due to recombination may confound phenotyping and molecular attempts at strain identification in the field.

We have shown that SCMV is in complex with MCMV in MLN-infected maize in East Africa, and that producing SCMV phylogenies does not produce useful classification systems or describe biological truth (Fig. S6)⁵². We conclude that constructing phylogenetic trees is inappropriate for SCMV due to extensive historical recombination between divergent isolates. This may also have implications for studies of other RNA viruses, and phylogenies of other organisms with high recombination rates. There are multiple avenues for progress in this field: for example, the statistical framework for assessing splits networks is not well developed, there are no automated approaches for locating viral recombination hot-spots, and the visualisation of reticulate recombination networks needs improvement.

Methods

NGS of MLN-infected maize

We collected maize leaf samples from Kenya and Ethiopia in August 2014 (Table S1), storing samples in RNA-later (Ambion) on dry ice. To extract RNA, we used Trizol (Ambion) according to the manufacturer’s instructions. We depleted ribosomal RNA with the Ribo-Zero Magnetic Kit (Plant Leaf - Epicentre). To generate indexed stranded libraries, we used Scriptseq V2 RNA-Seq Library Preparation kits and Scriptseq Index PCR primers (Epicentre). Library concentration and quality were confirmed using Qubit (Life Technologies) and a Bioanalyzer High Sensitivity DNA Chip (Agilent Technologies). Beijing Genomics Institute performed 100 bp paired-end sequencing on one lane of a HiSeq 2000 (Illumina).

NGS quality control

We used a custom Python script to demultiplex the libraries, allowing one error in index sequences, then trimmed adaptors using Trim Galore! (parameters: --phred64 --fastQC --illumina --length 30 --paired_retain_unpaired input_1.fq input_2.fq)^53,54. String matching deduplication (deletion of identical reads) was performed using Quality Assessment of Short Read (QUASR) pipeline scripts⁵⁵.
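The two quality-control steps above (index matching with one permitted error, then deletion of identical reads) can be sketched as follows. This is neither the custom demultiplexing script nor the QUASR code: the function names, the index sequences, and the rule that a read matching more than one index is left unassigned are assumptions. For simplicity the deduplication works on single reads, whereas paired-end data would normally be compared as pairs.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def assign_index(read_index, known_indexes, max_mismatches=1):
    """Return the sample whose index is within max_mismatches, else None.

    Assumption: a read index that matches more than one sample is discarded.
    """
    hits = [sample for sample, idx in known_indexes.items()
            if len(idx) == len(read_index)
            and hamming(idx, read_index) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None

def deduplicate(reads):
    """Keep the first occurrence of each read sequence (exact string match)."""
    seen = set()
    unique = []
    for name, seq in reads:
        if seq not in seen:
            seen.add(seq)
            unique.append((name, seq))
    return unique

# Hypothetical sample indexes and reads, for illustration only.
indexes = {"K01": "ATCACG", "K02": "CGATGT"}
print(assign_index("ATCACC", indexes))   # one mismatch from K01's index
print(assign_index("GGGGGG", indexes))   # no index within one mismatch

reads = [("r1", "ACGTAC"), ("r2", "ACGTAC"), ("r3", "TTGACA")]
print(deduplicate(reads))
```

Allowing one mismatch is only safe when every pair of index sequences differs at three or more positions; otherwise a single sequencing error can make a read equally close to two samples.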

SCMV consensus sequence generation

To generate SCMV genome sequences, we aligned libraries to a bowtie2 reference containing all SCMV genomes available in NCBI GenBank in March 2016 (parameters: -D 20 -R 2 -N 1 -L 20 -i S,0,2.50 --phred64 --maxins 1000 --fr)⁵⁶. Next, we extracted SCMV-aligning reads and performed de novo assembly using Trinity (v2.0.2), extracted contigs above 2 kb in length, then inspected and curated (if necessary) SCMV contigs⁵⁷. To generate SCMV consensus sequences, we aligned each library to its respective Trinity contig using bowtie2, generated pileups using samtools, and called sequences using the QUASR script pileup_consensus.py, with a threshold of 0% or 10% of reads for the calling of ambiguity codes (parameters: -ambiguity 0–10 -dependent -cutoff 25 -lowcoverage 20)⁵⁸.
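Consensus calling with an ambiguity threshold can be illustrated per alignment column. In the sketch below, every base supported by at least a given fraction of reads contributes to an IUPAC code, and columns with too few reads are called as N. This is not the QUASR pileup_consensus.py implementation, and how that script interprets its flags is not reproduced here: the function names, the default thresholds, and the toy pileup are assumptions.

```python
# Standard IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

def call_base(counts, min_fraction=0.10, min_coverage=20):
    """Call one consensus base from per-base read counts at a single position.

    counts maps 'A', 'C', 'G', 'T' to read depth. Both thresholds are assumptions.
    """
    depth = sum(counts.values())
    if depth < min_coverage:
        return "N"  # too few reads to call this position
    kept = frozenset(base for base, n in counts.items()
                     if n > 0 and n / depth >= min_fraction)
    if not kept:
        # No base reaches the threshold: fall back to the most common base.
        kept = frozenset(max(counts, key=counts.get))
    return IUPAC[kept]

def call_consensus(pileup, **kwargs):
    """Join per-position calls into a consensus string."""
    return "".join(call_base(counts, **kwargs) for counts in pileup)

# Hypothetical pileup: a clean site, a minor variant below 10%,
# a mixed A/G site, and a low-coverage site.
pileup = [
    {"A": 100},
    {"A": 95, "G": 5},
    {"A": 60, "G": 40},
    {"C": 5, "T": 3},
]
print(call_consensus(pileup))
```

Producing one consensus with ambiguity codes and one without matches the later use of separate alignments for sequences called with and without ambiguous bases.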

SCMV alignment and diversity analysis

SCMV genomes generated in this study were combined with those in GenBank and aligned using MUSCLE (gap extension cost: 800, other settings default) in MEGA6, with separate alignments for genomes called with and without ambiguous bases⁵⁹. We checked the alignments manually in Jalview and refined them where necessary⁶⁰. To construct a nucleotide identity matrix, we used the dist.alignment function from the R package seqinr. We obtained diversity metrics using the alignment without ambiguous base calls in DnaSP v5³⁹.

SCMV recombination analysis

Recombination analysis was performed with the alignment containing no ambiguous base calls. To generate splits networks, we used SplitsTree4 with default settings (distances calculated by uncorrected P, and network generated by neighbour-net)⁴⁸. To generate more specific predictions of recombination, we used RDP4 with the algorithms RDP, GENECONV, MaxChi, BootScan, and SiScan (all default settings), and reviewed all breakpoints manually. Recombination network diagrams were generated by constructing interaction matrices for regions and SCMV isolates. The regional interaction matrix was converted into a regional recombination network (Fig. 3c) using the ggraph function from the ggraph R package, while the individual interaction networks (Fig. 3d) were constructed using the ggnet2 function of the R package GGally.

Dendrogram and phylogeny construction

The nucleotide identity dendrogram was constructed from the identity matrix using the heatmap.2 function of the gplots R package. Phylogenies were constructed from the alignment without ambiguous bases using RAxML-HPC2 (8.2.10) on XSEDE, hosted by the CIPRES science gateway (parameters: -T 4 -N autoMRE -n result -s infile.txt -m GTRCAT -q part.txt -c 25 -p 12345 -f a -x 12345 --asc-corr lewis)⁶¹. Phylogenies were compared using the tanglegram function of Dendroscope (3.5.9)⁶².

Statistical analysis

We used an SH-test to test for tree incongruence⁶³ between phylogenies constructed using the three alignment sections. Trees for each section were generated using the methods above and compared pairwise using the SH-test as implemented in the R package phangorn⁶⁴. To determine whether the SH-test is appropriate for these data, we created a negative control dataset from our alignment in which we did not expect tree disagreement. In the negative control dataset, the null hypothesis (of identical tree architectures) should not be rejected. To generate the negative control dataset, we took a region containing a recombination cold-spot (positions 2,224 to 2,586 of the original alignment), which should have had little recombination and therefore a consistent evolutionary history, and divided it into two sections (positions 2,224 to 2,400 and 2,401 to 2,586). The log likelihoods of the tree topologies constructed from these sections were −5,895 and −5,735 respectively with a difference of 59.8. Subsequent testing with the SH-test did not reject the null hypothesis of the topologies agreeing (p = 0.21).

Data availability

RNA-seq data have been deposited in the ArrayExpress database⁶⁵ at EMBL-EBI (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-7002. SCMV contigs are available in Genbank under accession numbers MH093717-MH093739.

References

López-Moya, J. J., Valli, A. & García, J. A. Potyviridae. In Encyclopedia of Life Sciences (John Wiley & Sons, Ltd, Chichester, 2009).
Viswanathan, R. & Balamuralikrishnan, M. Impact of mosaic infection on growth and yield of sugarcane. Sugar Tech 7, 61–65 (2005).
Zhu, M. et al. Maize Elongin C interacts with the viral genome-linked protein, VPg, of Sugarcane mosaic virus and facilitates virus infection. The New phytologist 203, 1291–304, https://doi.org/10.1111/nph.12890 (2014).
Rybicki, E. P. A Top Ten list for economically important plant viruses. Arch. Virol. 160, 17–20, https://doi.org/10.1007/s00705-014-2295-9 (2015).
Wu, L., Zu, X., Wang, S. & Chen, Y. Sugarcane mosaic virus – Long history but still a threat to industry. Crop. Prot. 42, 74–78, https://doi.org/10.1016/j.cropro.2012.07.005 (2012).
Teakle, D. S. & Grylls, N. E. Four strains of sugarcane mosaic virus infecting cereals and other grasses in Australia. Aust. J. Agric. Res. 24, 465–477 (1973).
Li, L., Wang, X. & Zhou, G. Analyses of maize embryo invasion by Sugarcane mosaic virus. Plant Sci. 172, 131–138, https://doi.org/10.1016/j.plantsci.2006.08.006 (2007).
Perera, M. F., Filipone, M., Noguera, A. S., Cuenya, M. I. & Castagnaro, A. P. An overview of the sugarcane mosaic disease in south america. Funct. Plant Sci. Biotechnol. 6, 98–107 (2012).
Shi, X. M., Miller, H., Verchot, J., Carrington, J. C. & Vance, V. B. Mutations in the region encoding the central domain of helper component-proteinase (HC-Pro) eliminate potato virus X/potyviral synergism. Virol. 231, 35–42, https://doi.org/10.1006/viro.1997.8488 (1997).
González-Jara, P. et al. A single amino acid mutation in the Plum pox virus helper component-proteinase gene abolishes both synergistic and RNA silencing suppression activities. Phytopathol. 95, 894–901, https://doi.org/10.1094/PHYTO-95-0894 (2005).
Mahuku, G. et al. Maize lethal necrosis (MLN), an emerging threat to maize-based food security in sub-Saharan Africa. Phytopathol, https://doi.org/10.1094/PHYTO-12-14-0367-FI (2015).
Louie, R. Sugarcane Mosaic Virus in Kenya. Plant Dis. 64, 944, https://doi.org/10.1094/PD-64-944 (1980).
Chen, J., Chen, J. & Adams, M. J. Characterisation of potyviruses from sugarcane and maize in China. Arch. Virol. 147, 1237–1246, https://doi.org/10.1007/s00705-001-0799-6 (2002).
Wangai, A. W. et al. First report of Maize chlorotic mottle virus and maize lethal necrosis in Kenya. Plant Dis. 96, 1582–1582, https://doi.org/10.1094/PDIS-06-12-0576-PDN (2012).
Xie, L. et al. Characterization of Maize chlorotic mottle virus associated with maize lethal necrosis disease in China. J. Phytopathol. 159, 191–193, https://doi.org/10.1111/j.1439-0434.2010.01745.x (2011).
Achon, M., Serrano, L., Clemente-Orta, G. & Sossai, S. First report of Maize chlorotic mottle virus on a perennial host, Sorghum halepense, and maize in Spain. Plant Dis. 101, 393 (2017).
Urcuqui-Inchima, S., Haenni, A.-L. & Bernardi, F. Potyvirus proteins: a wealth of functions. Virus Res. 74, 157–175, https://doi.org/10.1016/S0168-1702(01)00220-9 (2001).
Olspert, A., Chung, B. Y.-W., Atkins, J. F., Carr, J. P. & Firth, A. E. Transcriptional slippage in the positive-sense RNA virus family Potyviridae. EMBO reports e201540509, https://doi.org/10.15252/embr.201540509 (2015).
Rodamilans, B. et al. RNA polymerase slippage as a mechanism for the production of frameshift gene products in plant viruses of the Potyviridae family. J. virology 89, 6965–6967, https://doi.org/10.1128/JVI.00337-15 (2015).
Chare, E. R. & Holmes, E. C. A phylogenetic survey of recombination frequency in plant RNA viruses. Arch. Virol. 151, 933–946, https://doi.org/10.1007/s00705-005-0675-x (2006).
Sztuba-Solińska, J., Urbanowicz, A., Figlerowicz, M. & Bujarski, J. J. RNA-RNA recombination in plant virus replication and evolution. Annu. review phytopathology 49, 415–43, https://doi.org/10.1146/annurev-phyto-072910-095351 (2011).
Bujarski, J. J. Genetic recombination in plant-infecting messenger-sense RNA viruses: overview and research perspectives. Front. plant science 4, 68, https://doi.org/10.3389/fpls.2013.00068 (2013).
Seo, J.-K. et al. Molecular variability and genetic structure of the population of soybean mosaic virus based on the analysis of complete genome sequences. Virol. 393, 91–103, https://doi.org/10.1016/j.virol.2009.07.007 (2009).
Tugume, A. K., Cuéllar, W. J., Mukasa, S. B. & Valkonen, J. P. T. Molecular genetic analysis of virus isolates from wild and cultivated plants demonstrates that East Africa is a hotspot for the evolution and diversification of Sweet potato feathery mottle virus. Mol. Ecol. 19, 3139–3156, https://doi.org/10.1111/j.1365-294X.2010.04682.x (2010).
Ohshima, K. et al. Patterns of recombination in turnip mosaic virus genomic sequences indicate hotspots of recombination. J. Gen. Virol. 88, 298–315, https://doi.org/10.1099/vir.0.82335-0 (2007).
Bousalem, M. et al. High genetic diversity, distant phylogenetic relationships and intraspecies recombination events among natural populations of Yam mosaic virus: a contribution to understanding potyvirus evolution. J. Gen. Virol. 243–255 (2000).
Moreno, I. M. et al. Variability and genetic structure of the population of watermelon mosaic virus infecting melon in Spain. Virol. 318, 451–460, https://doi.org/10.1016/j.virol.2003.10.002 (2004).
Achon, M. A., Serrano, L., Alonso-Dueñas, N. & Porta, C. Complete genome sequences of Maize dwarf mosaic and Sugarcane mosaic virus isolates coinfecting maize in Spain. Arch. Virol. 152, 2073–2078, https://doi.org/10.1007/s00705-007-1042-x (2007).
Gell, G., Sebestyén, E. & Balázs, E. Recombination analysis of Maize dwarf mosaic virus (MDMV) in the Sugarcane mosaic virus (SCMV) subgroup of potyviruses. Virus Genes 50, 79–86, https://doi.org/10.1007/s11262-014-1142-0 (2015).
Li, Y., Liu, R., Zhou, T. & Fan, Z. Genetic diversity and population structure of Sugarcane mosaic virus. Virus Res. 171, 242–246, https://doi.org/10.1016/j.virusres.2012.10.024 (2013).
Moradi, Z., Mehrvar, M., Nazifi, E. & Zakiaghl, M. The complete genome sequences of two naturally occurring recombinant isolates of Sugarcane mosaic virus from Iran. Virus Genes 52, 270–280, https://doi.org/10.1007/s11262-016-1302-5 (2016).
Padhi, A. & Ramu, K. Genomic evidence of intraspecific recombination in sugarcane mosaic virus. Virus genes 42, 282–5, https://doi.org/10.1007/s11262-010-0564-6 (2011).
Xie, X. et al. Molecular variability and distribution of Sugarcane mosaic virus in Shanxi, China. PLoS One 11, 1–12, https://doi.org/10.1371/journal.pone.0151549 (2016).
Zhong, Y. et al. Identification of a naturally occurring recombinant isolate of Sugarcane mosaic virus causing maize dwarf mosaic disease. Virus Genes 30, 75–83, https://doi.org/10.1007/s11262-004-4584-y (2005).
Handley, J. A., Smith, G. R., Dale, J. L. & Harding, R. M. Sequence diversity in the CP coding region of eight sugarcane mosaic potyvirus isolates infecting sugarcane in Australia. Arch. virology 143, 1145–1153, https://doi.org/10.1007/BF01718631 (1998).
Mishra, R., Patil, S., Patil, A. & Patil, B. L. Sequence diversity studies of papaya ringspot virus isolates in south india reveal higher variability and recombination in the 5′-terminal gene sequences. VirusDisease, https://doi.org/10.1007/s13337-019-00512-x (2019).
Adams, I. P. et al. Use of next-generation sequencing for the identification and characterization of Maize chlorotic mottle virus and Sugarcane mosaic virus causing maize lethal necrosis in Kenya. Plant Pathol. 62, 741–749, https://doi.org/10.1111/j.1365-3059.2012.02690.x (2013).
Braidwood, L. et al. Maize chlorotic mottle virus exhibits low divergence between differentiated regional sub-populations. Sci. Reports 8, 1–9, https://doi.org/10.1038/s41598-018-19607-4 (2018).
Librado, P. & Rozas, J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinforma. 25, 1451–1452, https://doi.org/10.1093/bioinformatics/btp187 (2009).
Pasin, F., Simón-Mateo, C. & García, J. A. The hypervariable amino-terminus of P1 protease modulates potyviral replication and host defense responses. PLoS Pathog. 10, https://doi.org/10.1371/journal.ppat.1003985 (2014).
Article PubMed PubMed Central Google Scholar
Frenkel, M. J. et al. Unexpected sequence diversity in the amino-terminal ends of the coat proteins of strains of sugarcane mosaic virus. J. Gen. Virol. 72, 237–242, https://doi.org/10.1099/0022-1317-72-2-237 (1991).
Article CAS PubMed Google Scholar
Xiao, X. W., Frenkel, M. J., Teakle, D. S., Ward, C. W. & Shukla, D. D. Sequence diversity in the surface-exposed amino-terminal region of the coat proteins of seven strains of sugarcane mosaic virus correlates with their host range. Arch. Virol. 399–408 (1993).
Murrell, B. et al. Detecting individual sites subject to episodic diversifying selection. PLoS genetics 8, e1002764 (2012).
Article CAS PubMed PubMed Central Google Scholar
Kosakovsky Pond, S. L. & Frost, S. D. Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol. biology evolution 22, 1208–1222 (2005).
Article Google Scholar
Shukla, D. D., Strike, P. M., Tracy, S. L., Gough, K. H. & Ward, C. W. The N and C termini of the coat proteins of potyviruses are surface-located and the N terminus contains the major virus-specific epitopes. J. Gen. Virol. 69, 1497–1508, https://doi.org/10.1099/0022-1317-69-7-1497 (1988).
Article CAS Google Scholar
L´opez-Moya, J. J., Wang, R. Y. & Pirone, T. P. Context of the coat protein DAG motif affects potyvirus transmissibility by aphids. J. Gen. Virol. 80, 3281–3288, https://doi.org/10.1099/0022-1317-80-12-3281 (1999).
Article Google Scholar
Firth, A. E. & Brown, C. M. Detecting overlapping coding sequences in virus genomes. BMC Bioinforma. 6, 1–6, https://doi.org/10.1186/1471-2105-7-75 (2006).
Article CAS Google Scholar
Huson, D. H. & Bryant, D. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267, https://doi.org/10.1093/molbev/msj030 (2006).
Article CAS PubMed Google Scholar
Martin, D. P., Murrell, B., Golden, M., Khoosal, A. & Muhire, B. RDP4: Detection and analysis of recombination patterns in virus genomes. Virus Evol. 1, 1–5, https://doi.org/10.1093/ve/vev003 (2015).
Article Google Scholar
Revers, F., Le Gall, O., Candresse, T., Le Romancer, M. & Dunez, J. Frequent occurrence of recombinant potyvirus isolates. J. Gen. Virol. 77, 1953–1965, https://doi.org/10.1099/0022-1317-77-8-1953 (1996).
Article CAS PubMed Google Scholar
Gao, B., Cui, X.-W., Li, X.-D., Zhang, C.-Q. & Miao, H.-Q. Complete genomic sequence analysis of a highly virulent isolate revealed a novel strain of Sugarcane mosaic virus. Virus genes 43, 390–7, https://doi.org/10.1007/s11262-011-0644-2 (2011).
Article CAS PubMed Google Scholar
Posada, D. & Crandall, K. The effect of recombination on the accuracy of phylogeny estimation. J. molecular evolution 54, 396–402, https://doi.org/10.1007/s00239 (2002).
Article ADS CAS Google Scholar
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads (2011).
Article Google Scholar
Krueger, F. Trim Galore!: A wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files (2015).
Watson, S. J. et al. Viral population analysis and minority-variant detection using short read next-generation sequencing. Philos. transactions Royal Soc. Lond. Ser. B, Biol. sciences 368, 20120205, https://doi.org/10.1098/rstb.2012.0205 (2013).
Article Google Scholar
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat Methods 9, 357–359, https://doi.org/10.1038/nmeth.1923 (2012).
Article CAS PubMed PubMed Central Google Scholar
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. biotechnology 29, 644–52, https://doi.org/10.1038/nbt.1883 (2011).
Article CAS Google Scholar
Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinforma. (Oxford, England) 27, 2987–93, https://doi.org/10.1093/bioinformatics/btr509 (2011).
Article CAS Google Scholar
Tamura, K., Stecher, G., Peterson, D., Filipski, A. & Kumar, S. MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. biology evolution 30, 2725–9, https://doi.org/10.1093/molbev/mst197 (2013).
Article CAS Google Scholar
Clamp, M., Cuff, J., Searle, S. M. & Barton, G. J. The Jalview Java alignment editor. Bioinforma. 20, 426–427, https://doi.org/10.1093/bioinformatics/btg430 (2004).
Article CAS Google Scholar
Miller, M. A., Pfeiffer, W. & Schwartz, T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. 2010 Gatew. Comput. Environ. Work. GCE 2010, https://doi.org/10.1109/GCE.2010.5676129 (2010).
Huson, D. H. & Scornavacca, C. Dendroscope 3: An interactive tool for rooted phylogenetic trees and networks. Syst. Biol. 61, 1061–1067, https://doi.org/10.1093/sysbio/sys062 (2012).
Article PubMed Google Scholar
Planet, P. J. Tree disagreement: measuring and testing incongruence in phylogenies. J. biomedical informatics 39, 86–102 (2006).
Article CAS Google Scholar
Schliep, K. P. phangorn: phylogenetic analysis in r. Bioinforma. 27, 592–593 (2010).
Article Google Scholar
Kolesnikov, N. et al. Arrayexpress update—simplifying data submissions. Nucleic acids research 43, D1113–D1116 (2014).
Article PubMed PubMed Central Google Scholar
Wickham, H. et al. ggplot2: An implementation of the grammar of graphics. R package version 0.7, http://CRAN.R-project.org/package= ggplot2 (2008).

Download references

Acknowledgements

The authors thank colleagues at KALRO for their assistance in maize sampling, farmers in all study areas for providing maize samples, and John Welch for guidance on phylogenetic analyses. L.B. is supported by the BBSRC DTP and the 2Blades Foundation; S.Y.M. was supported by the European Research Council Advanced Investigator Grant ERC-2013-AdG 340642 - TRIBE; and D.C.B. is supported by the Royal Society Edward Penley Abraham Research Professorship.

Author information

Authors and Affiliations

University of Cambridge, Department of Plant Sciences, Cambridge, CB2 3EA, United Kingdom
Luke Braidwood, Sebastian Y. Müller & David Baulcombe

Contributions

L.B. performed sampling, sequencing, data analysis, and drafted the paper. S.Y.M. performed data analysis, drafted text, and edited the manuscript. D.C.B. designed experiments and edited the manuscript.

Corresponding author

Correspondence to Luke Braidwood.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Braidwood, L., Müller, S.Y. & Baulcombe, D. Extensive recombination challenges the utility of Sugarcane mosaic virus phylogeny and strain typing. Sci Rep 9, 20067 (2019). https://doi.org/10.1038/s41598-019-56227-y

Received: 05 June 2019
Accepted: 28 November 2019
Published: 27 December 2019
DOI: https://doi.org/10.1038/s41598-019-56227-y
