Rebound of Cotton leaf curl Multan virus and its exclusive detection in cotton leaf curl disease outbreak, Punjab (India), 2015

Cotton leaf curl disease (CLCuD) outbreaks caused by CLCuD associated begomoviruses (CABs) significantly constrain cotton production in India and Pakistan. In comparison to the CABs circulating in Pakistan, molecular epidemiology, evolution and recombination patterns of CABs circulating in India are less studied. In this work, we characterized CAB complex sequences obtained from the most recent outbreak (Punjab, India, 2015), and rigorously analyzed them with reference to GenBank sequences, submitted from India, Pakistan and other neighbouring countries, using contemporary bioinformatics approaches. In this manuscript, we illustrate the detection of a recombinant, phylogenetically distinct clade of Cotton leaf curl Multan virus (CLCuMuV), suggesting rebound of CLCuMuV in this region. Interestingly, we could not detect Cotton leaf curl Kokhran virus-Burewala strain (CLCuKoV-Bu), which was prevalent in this region, until now. Our study thus indicates substitution of the ‘virulent resistance breaking’ CLCuKoV-Bu by the re-emerging CLCuMuV recombinants. Our findings corroborate with that of a very recent study from Pakistan and we here discuss epidemiological links between the CAB complexes reported in these two studies. Taken together, these observations signify a shifting epidemiology of CABs, and seem to correlate with the recent prediction of the ‘third epidemic’ of CLCuD in the Indian subcontinent.

Scientific REPORTS | 7: 17361 | DOI: 10.1038/s41598-017-17680-9 Location of Punjab, and areas worst affected by this outbreak are shown in Fig. 1. We surveyed 10 such cotton fields and collected 50 leaf samples with severe symptoms of CLCuD (downward curling of leaves, vein thickening & yellowing, stunted plant growth etc. (Fig. 1 and Supplementary Fig. S1). Samples were subsequently preserved and transported to the laboratory for further processing. On reaching the laboratory, 20 leaf samples (2 samples from each of the 10 sampled fields) were processed and total DNA was extracted using a commercially available plant DNA extraction kit (Sigma-Aldrich, St. Louis, USA). Quality and quantity of the extracts were checked on agarose gels and spectrophotometer (Picodrop, Cambridge, UK), respectively. In addition to the cotton leaf samples, two samples from unrelated plants (without any begomoviral infection, previously verified by RCA-PCR) and one sample with Tomato leaf curl Bangladesh virus (ToLCBV) and Tomato leaf curl Bangladesh betasatellite (ToLCBB) infection 31 , collected from a distant geographical location (Tezpur, Assam, India, 26.63°N, 92.8°E) were also processed concurrently throughout the experiments. These samples served as controls during the entire experimental procedures.

Rolling Circle Amplification (RCA) and Polymerase Chain Reaction (PCR). Total DNA extracts
were diluted and subsequently subjected to Φ29 DNA polymerase (Fermentas, Villinius, Lithuania) mediated RCA to enrich the begomoviral and satellite DNA components. RCA products were partial digested by incubation with 1 U of restriction endonuclease Hind III (Sigma-Aldrich) for 25 mins at 37 °C following Khan et al. 32 , column purified (Sigma-Aldrich), quantified (Picodrop) and diluted to approximately 50 ng/µL. Enriched RCA products were subsequently used in individual PCR assays for partial or complete amplification of each of the four genetic components, namely the DNA-A, DNA-B, alphasatellite and betasatellite, using previously reported begomovirus complex universal primer pairs 1,33,34 for DNA-A (DengA/ DengB, CRv301/CRc1152, PAR1c496/ PAL1v1978 and PAR1v722/PAL1c1960), DNA-B (BF518/BR1641), alphasatellite (DNA101/DNA102) and betasatellite (Beta01/Beta02). DNA samples amplifiable with universal DNA-A primer pairs were further amplified with additional overlapping primer pairs to sequence the entire genome of the begomoviruses. Details of the PCR primers designed in the present study can be found online as Supplementary Table S1, Fig. S2. Guidelines and precautions to avoid cross contamination during RCA and PCR set-ups and amplicon analyses were strictly followed throughout the experiments 35 . Cloning and Sequencing. Amplicons generated in individual PCR reactions were column purified (Sigma-Aldrich), ligated into T/A cloning vector (Fermentas) and were used to transform XL1-Blue competent bacterial cells. Five-six transformed clones were randomly selected from each transformation experiment, cultured and plasmid DNA was purified (Sigma-Aldrich). Two clones each of the two amplicons denoting subgenomic betasatellites were also prepared. Each plasmid DNA was then diluted and PCR amplified using vector specific primers. PCR amplicons were finally sent to commercial facilities for sequencing (BioServe Biotechnologies, Hyderabad and SciGenom Labs, Kochi, India). Sequence analysis. Sequence electropherograms were manually examined, edited and vector sequence trimmed using Bioedit program version 7.2.5 36 . Sequences generated from individual samples were assembled on the basis of the overlapping sequences (approximately 200 nts), to reconstruct complete genomes. Integrity of the ORFs in the genome sequences were checked using the NCBI ORFfinder algorithm (www.ncbi.nlm.nih.gov/ orffinder/) and the Fangorn-Forest method implemented in the Geminivirus Data warehouse server (http://geminivirus.org:8080/geminivirusdw/) 37 . Sequences were then BLAST 38 searched to identify and retrieve homologous sequences submitted in the GenBank. Additional GenBank sequences submitted from India and neighbouring countries were retrieved using the Geminivirus Data warehouse, or using accession numbers provided in the relevant recent literature 1,5,9,16,37 . To save computational time, redundant sequences were sorted out by running the datasets through the ElimDupes program implemented in the Los Alamos HCV sequence database server 39 (https://hcv.lanl.gov/content/sequence/ELIMDUPES/elimdupes.html) and sequences representing a group of identical or near identical sequences (percent nucleotide identity, PNI ≥ 99%) were selected for further analysis.
Sequence datasets were aligned using the MUltiple Sequence Comparison by Log-Expectation algorithm (MUSCLE) 40 (http://www.ebi.ac.uk/Tools/msa/muscle/), followed by molecular evolutionary analysis using MEGA 41 version 6.06. Evolutionary distances were calculated employing the Maximum Composite Likelihood method and phylogenetic relatedness among the sequences were inferred using the Neighbour-Joining (NJ) algorithm. Reliability of the phylogenetic relatedness of the branches was evaluated by percentages obtained through 1000 bootstrap iteration of the datasets. Ambiguous nucleotides, gaps and missing data were not considered in analysis. Pairwise sequence comparisons were performed using Sequence Demarcation Tool (SDT) 42 , version 1.2.
Since geminiviruses undergo evolution through frequent recombinations 43 , ambiguities in sequence relatedness are common in conventional phylogenies. To prevent such ambiguities and to test reticulate evolution (signifying recombinations), we reconstructed Neighbour Network (NN), using the SplitsTree 44 version 4.14.1, with Kimura-2 parameter distance correction, exclusion of gaps and parsimony uninformative sites. Presence of reticulation in NN has been shown to signify recombination and such analyses are shown to estimate tree topologies more accurately in datasets with recombinant sequences 45 .
To examine the recombination events and probable breakpoints, aligned datasets were analyzed through a set of algorithms, namely RDP, GENECONV, BOOTSCAN, MAXCHI, CHIMAERA, SISCAN, 3SEQ and PHYLPRO, incorporated in the RDP suite version 4.85, using default settings and standard Bonferroni correction 46 . Recombination events detected by at least three or more different algorithms with sufficient statistical support (P < 0.01) were considered as true events. Each of the events was further verified from breakpoint distribution plots and by comparing the UPGMA phylogenetic trees generated with genetic regions from major and minor parents.
Diagnostic PCR for specific detection of CLCuKoV-Bu. To further verify the results obtained from complete genome analysis of the virus from the representative samples, all the samples collected from the fields were subjected to diagnostic PCRs for CLCuMuV and CLCuKoV-Bu. In brief, RCA enriched templates were separately amplified with primer pair CLCuV-1F/ CLCuV-4R (designed on alignment of CLCuMuV sequences obtained in the present study along with GenBank reference sequences) and primer pair CLCuKoV-Bu-F/ CLCuKoV-Bu-R, which was recently designed by Brown et al. 8

Results
Amplification and sequencing of begomovirus and satellite genomes. Initially, we attempted to screen the DNA extracts with universal primers for geminiviruses and satellites using direct PCR approach. Though we could get amplicons of expected size in most of the samples with DNA-A, alphasatellite and betasatellite primer pairs, but amplification was sub-optimal, along with presence of non-specific amplicons. To improve specific amplification, we adopted the RCA mediated circular ssDNA enrichment strategy for viral and satellite genetic components present in the total DNA extracts. In post-RCA PCR, we obtained optimal amplification in all the test DNA extracts with primer pairs for DNA-A, alphasatellite and betasatellite. In addition to amplicons of expected size, some subgenomic amplicons (approximately 0.7 kb and 1.1 kb) were also generated in PCR with betasatellite primers. However, even after repeated attempts we could not amplify DNA-B. Amplification results thus implicated monopartite begomovirus disease complex in our samples. Consequent to amplification, cloning, sequencing and assembling of the amplicons, 32 complete genome sequences each for DNA-A and betasatellite and 30 complete genome sequences for alphasatellite were available for further analyses. Diversity of the sequences and results of BLAST analysis. Sequences were initially screened through the ElimDupes program to identify identical or near identical sequences (≥99% cut-off). Begomovirus sequences were found to segregate into four groups, and one sequence representing each of these four groups was selected for further analyses and submitted to the GenBank (isolates S78-S81; GenBank accession numbers KY120359-KY120362). Similarly, five representative betasatellite sequences were selected for subsequent analyses (isolates S82-S85 and S90; GenBank accession numbers KY081413-KY081416 and KY305676). Detailed features of the representative virus and betasatellite sequences are presented in Table 1. Considering the higher diversity of alphasatellite genome in the present study, 14 representative sequences were analyzed in details (isolates C31, C35-C40 and C43-C48, C51; GenBank accession numbers MF929022-MF929028, MF929030-MF929036). Detailed features of the representative alphasatellite can be found online as Supplementary Table S2.
All the DNA-A sequences had a length of approximately 2.7 kb and a genetic arrangement and ORFs, typical for begomoviruses. Among the DNA-A genomes, sequences KY120359, KY120360 and KY120361 were similar and shared PNI score of 98.1% ± 0.6 (mean ± SD) among themselves. However, these three sequences had a significantly less PNI score (93.0% ± 1.0, P < 0.001) when compared to the sequence KY120362. The difference among the sequences (KY120356 -KY120361 vs. KY120362) was also evident in the BLASTn homology search (megablast). Sequences KY120359, KY120360 and KY120361 were primarily found to be most homologous (94-97% identity at 98-100% query coverage) to three GenBank CLCuMuV sequences isolated from Vehari, Pakistan 1 during 2015 (KX656810, KX656806, KX656809) followed by other sequences isolated from different parts of northwestern India during 2012-14 (KJ959628, KX831888, KM096467, KM096469, KX831891 etc.). On the other hand, sequence KY120362 showed best match (maximum 94% identity at 100% query coverage), with CLCuMuV sequences recently reported from northern and northwestern India (KY561820, KX951461, KX951460, KJ868820, JN678804) followed by sequence from Pakistan (EU384573, AJ002447, AJ002458, AJ496287). Additionally, SDT analysis of the present samples with relevant NCBI RefSeq sequences and type sequences 5 of each of the CABs and their respective strains demonstrated the presence of two different strains of CLCuMuV in the infected samples ( Supplementary Fig. S3).
On the other hand, betasatellite sequences of various lengths including subgenomic sequences with defective or intact coding regions were detected in the present samples. Of the 5 representative sequences, KY081413 -KY081415 denoted typically complete sequences with varying lengths (1361-1378 nts) and with intact βC1 coding region. These sequences shared PNI of 88-95% among themselves and showed close homology (up to 100% query coverage and up to 92-99% identity) with a number of GenBank CLCuMuB sequences, recently isolated from India and Pakistan. However, sequences KY081416 (688 nts) and KY305676 (1171 nts) denoted subgenomic betasatellite sequences and in BLAST analysis, showed maximum homology (94-98% identity) with defective interfering or subgenomic CLCuMuB sequences, recently isolated from Rajasthan and Punjab of northwestern India. SDT analyses results of betasatellite sequences along with NCBI RefSeq betasatellite sequences can be found online as Supplementary Fig. S4.  Interestingly, as compared to the DNA-A and betasatellite sequences, genetic diversity of alphasatellite was found to be considerably high in the present samples. Of the 30 complete alphasatellite sequences, based on the results of ElimDupes program (≥99% cut-off), 14 representative sequences were analyzed for further analyses. Although these sequences have PNI ranging from 67-98% among themselves and varied in length (ranging from 1362-1378 nts), but all the sequences retained an intact alpha Replication initiator protein (Rep) ORF of identical length and an A-rich region of variable length. On BLASTn analysis, these sequences were found to be homologous (100% query coverage and 93-99% identity) to recent GenBank alphasatellite sequences isolated primarily from Pakistan and belonged to four different species, namely, Okra leaf curl alphasatellite (OLCuA), Gossypium darwinii symptomless alphasatellite (GDarSLA), Ageratum yellow vein India alphasatellite (AYVIA) and Tomato leaf curl alphasatellite (ToLCA). Of these ToLCA was the most frequently encountered species within the clonal sequences followed by OLCuA, GDarSLA and AYVIA. Details of alphasatellite sequence BLAST analyses results can be found online as Supplementary Table S2.

Begomovirus
Phylogenetic analysis of the begomovirus DNA-A and satellite sequences. In the NJ phylogenetic tree reconstructed with 244 GenBank DNA-A sequences along with the present sequences (Fig. 2a), the reference sequences segregated according to the clades described recently for CLCuMuV 9 . Corroborating with the BLASTn results, sequences generated in this study were found to be phylogenetically affiliated to CLCuMuV, of which sequences KY120359, KY120360 and KY120361 together with three recent sequences isolated from Vehari, Pakistan 1 (KX656806, KX656809 & KX656810) formed a distinct clade, supported by high bootstrap value (Fig. 2a). This clade further bifurcated into two subclades, differentiating the present sequences from that of Pakistan, signifying their origin from a common progenitor, and their divergence subsequently. This clade was evidently distinct from all the seven other previously reported clades of CLCuMuV sequences 9 , which indicates recent emergence of these sequences. Phylogenetically, this clade was most closely related to the CLCuMuV clade iv, which is represented by sequences originating mostly from India (Fig. 2a). Interestingly, the three Pakistan sequences (KX656806, KX656809 & KX656810) 1 that composed the new clade along with our sequences, were found to be phylogenetically divergent from other CLCuMuV sequences isolated from the same site, during the same study. On the other hand, sequence KY120362 separately clustered with sequences belonging to CLCuMuV, isolated from India (JN678804 & KJ868820) and Pakistan (EU384573).
In NN analysis (Fig. 3), agglomeration of the different clades of CLCuMuV was observed to be analogous to the NJ tree (Fig. 2a). The NN tree demonstrated extensive reticulation among the different clades of CLCuMuV, signifying frequent inter and intragenic recombinations. Nevertheless, in the NN tree also, our sequences clustered with three particular sequences (KX656806, KX656809 & KX656810) from Vehari, Pakistan 1 , and collectively symbolize the emergence of a separate lineage. The basal network of this CLCuMuV lineage appears to indicate its origin from a recombinant ancestor, phylogenetically interspersed between CLCuMuV clades (ii), (iv), (v) and (vi), indicating the contribution of these clades in emergence of the present sequences. An interesting observation of the NN tree was the extremely high reticulation among the CLCuMuV sequences, as compared to other CABs, suggesting relatively higher incidences of genetic recombination occurring among CLCuMuV and its strains.
On the other hand, in the phylogenetic tree generated with CLCuMuB sequences, sequences previously isolated (before 2013) from India and Pakistan appeared to form distinct clusters, although there were a few sequences that clustered irrespective of the country of origin (Fig. 2b). CLCuMuB sequences from Southeast Asian countries were also found to cluster separately, indicating their monophyletic origin. Very interestingly CLCuMuB sequences generated in the present study clustered with a number of CLCuMuB sequences recently isolated from Vehari, Pakistan 1 , along with sequences very recently (after 2013) isolated from other places of northwestern India and Pakistan (Fig. 2b). Particularly one of the present sequences (KY081415) along with a number of betasatellite sequences very recently isolated from different parts of northwestern India (KT228323, KX951462, KY800907, KT228325) were found to cluster separately with the recently denoted CLCuMuB Veh sequences (KX697597 & KX697602) 1 . The observed phylogenetic relatedness of the present sequences, their divergence from previously circulating betasatellite sequences in this region indicates a relatively recent emergence of the present betasatellites sequences, corroborating with the data obtained from DNA-A sequences.
In phylogenetic tree generated with reference GenBank alphasatellite sequences and BLASTn returned homologous sequences, clustering of present sequences was in accordance with the BLASTn results. Interestingly, one of the GDarSLA sequence (MF929023) isolated in the present study was observed to be phylogenetically related to alphasatellite sequences (KX656836, KX656839-KX656842) recently reported in cotton from Pakistan 1 (Fig. 4). Similarly all the ToLCA sequences also clustered with two recent ToLCA sequences from Pakistan (LN829154, LN829155). On the other hand, AYVIA and OLCuA sequences were also found to be phylogenetically affiliated to sequences mainly from Pakistan, along with some Indian isolates.

Analyses of recombination in DNA-A and satellite sequences. Analysis of the present DNA-A
sequences showed clear events of inter-species and intra-species recombinations involving most of the coding regions and strongly supported by statistical signature. Details of the recombination events are summarized in Fig. 5a and Table 2. Specifically, two different patterns of recombination namely R1 (between two different CLCuMuV sequences) and R2 (Between CLCuMuV and CLCuKoV) were common in the three representative sequences analysed here (KY120359, KY120360 and KY120361). On the other hand, events R3 and R4 signify CLCuAlV and CLCuKoV as parental sequence and were common to all the four representative sequences ( Table 2). Except the relative alteration in the site of event R4 in KY120361, probable breakpoints and span of the recombinant fragment was similar in all the events, suggesting a common origin for the sequences. Apart from the above four events, sequence KY120362 had an additional recombinant (R5) fragment (CLCuMuV and CLCuKoV). Among these events, individual recombinant fragment or a combination of fragments were observed in a number of GenBank sequences previously isolated from India and Pakistan. Strongly corroborating with the BLASTn and phylogenetic analyses, sequences recently isolated from Vehari, Pakistan 1 (particularly KX656806, KX656809 and KX656810) had precisely similar combination of events ( Supplementary Fig. S5). Moreover, in two of the events, namely R2 and R4, sequences recently isolated from Vehari, Pakistan served as parents, signifying the epidemiological relationship among the Pakistan sequences with ours. Recombination event R5 also signified recombination involving CLCuMuV and CLCuKoV-Bu sequences from India and Pakistan respectively.
Among the betasatellite sequences generated in this study, four different recombination events were detected. Details of the recombination events are presented in Fig. 5b and Table 3. Events R7 (within the βC1 coding region), R9 and R10 involved CLCuMuB sequences from India, while event R8 and R9 involved Tomato leaf curl Joydebpur betasatellite (ToLCJoB) and Tobacco leaf curl betasatellite (TbLCuB) sequences respectively as minor parents, and included the A-rich and SCR regions.
Similarly, frequent recombination was also observed within the alphasatellite sequences. Evidences of recombinants spanned across the alpha Rep ORF as well as the A-rich region. Our alphasatellite sequences belonging to different species had evidences of GDarSLA, CLCuMuA, CLCuBuA and OLCuA sequences. Most diverse patterns of recombinations were observed within the OLCuA sequences. ToLCA sequences were quite similar to each other in terms of parental sequences, and differed only in the span of recombinant fragment, suggesting a common source of origin. Recombination pattern of the GDarSLA sequences were found to be matching to that of GDarSLA sequences (KX656836, KX656840-KX656842) recently reported from Paksitan 1 . Details of the recombination patterns and statistical features observed in the present alphasatellite sequences can be found online as Supplementary Fig. S6 and Table S3.

Results of diagnostic PCR for CLCuKoV-Bu.
Using diagnostic PCR with primers specific for CLCuKoV-Bu, we could not detect amplicons in any of the sample extracts, collected during the present study. However, amplicons were clearly detectable in PCR tubes with primers designed for amplifying CLCuMuV sequences.

Discussion
Taken together, results of this study demonstrated the exclusive detection of CLCuMuV in plant samples collected from the most recent CLCuD outbreak in Punjab (India). However, we could not detect CLCuKoV-Bu sequences in any of the present samples, although we have used a highly sensitive RCA-PCR based clonal sequencing approach and specific diagnostic PCR assay. This finding was striking, since almost a decade CLCuKoV-Bu has prevailed in the Indian subcontinent and has also been associated with a number of severe outbreaks of CLCuD in India and Pakistan, until recently 1,3,6,8,9,16 . Our results thus connote the rebound of CLCuMuV in this region of the Indian subcontinent, signifying a shifting epidemiology of CLCuD.  relatedness among them. Furthermore, these recent sequences were clearly distinct from sequences previously isolated from the Indian subcontinent. Recombination patterns of the CLCuMuV and a number of associated satellite sequences from both these studies were also precisely similar and were unique, as compared to previously reported sequences. These evidences suggest towards a strong epidemiological connection among the CAB complexes reported in both these studies. Another significant finding of both these studies was the complete non-detection of CLCuKoV-Bu sequences. This finding was quite surprising, since the study by Zubair et al. 1 was done on samples from Vehari, Pakistan, where the CLCuKoV-Bu was first reported 18 . Collectively, these observations provide molecular evidences signifying the re-emergence of CLCuMuV in the Punjab region (of both India and Pakistan), which in turn imply a shift in molecular epidemiology of the predominant CABs in the Indian subcontinent.
Apart from the similarities mentioned above, our findings differed from that of Zubair et al. 1 , in certain aspects. Of these, re-appearance of CLCuKoV and CLCuAlV, along with CLCuMuV in Pakistan 1 is interesting. Although presence of recombination fragments from these two species was indicated in one of our present sequences (KY120362) but we could not detect these two CAB species per se, in our samples. Notably CLCuKoV and CLCuAlV are believed to have emerged in Pakistan, but as compared to the widely distributed CLCuMuV and CLCuKoV-Bu (these two also have evolved in Pakistan), CLCuKoV and CLCuAlV are sporadically detected in India 16,21 . We assume that this differential distribution of CABs might be attributable to the transmission efficiency of the whitefly species prevalent in the Indian subcontinent, since it has recently been demonstrated that transmission efficiency for a given begomoviruses varies significantly with the species of the whitefly 47 . It therefore remains an exciting topic for further research to examine the transmission efficiency of different CABs by Asia II-1 species of B. tabaci, which is the single predominant whitefly species in this part of the Indian subcontinent 2,3 .
On the other hand, alphasatellites ToLCA and OLCuA were frequently detected in our study samples, contrasting the detection of GDarSLA, GDavSLA and CLCuBuA by Zubair et al. 1 . CLCuBuA and CLCuMuA specific fragments were detected only as contributing parents in the ToLCA and OLCuA sequences analyzed in this study.  Table 2 (DNA-A) and Interestingly, all the present ToLCA sequences were found to be homologous (95-98% match over 100% query coverage) and phylogenetically related to two GenBank sequences (LN829154 and LN829155), very recently isolated from cotton from Pakistan. Similarly, most of the OLCuA sequences determined in this study were phylogenetically more related to OLCuA sequences recently reported from Pakistan 48 , as compared to OLCuA sequences recently isolated from northern India 49 . To the best of our knowledge, association of alphasatellites other than CLCuMuA and CLCuBuA with CABs has not been reported in previous CLCuD outbreaks. Altogether, the results of the alphasatellite sequence analyses also attest to the recent epidemiological linkage of different genetic components of the present begomovirus disease complex with those circulating in Pakistan.
Very recently, Sattar et al. 3 analyzed the CLCuD incidence data from 1997-2015, and observed a radical decline in the incidence of the ongoing "Burewala epidemic" during 2014-15. They anticipated this phase as pre-epidemic virus incubation period for the next epidemic in this region. Further, they predicted the probability of involvement of CLCuKoV-Bu (with intact C2), CLCuMuB and ToLCNDV with the next epidemic (third epidemic) of CLCuD in the Indian subcontinent 3 . Based on statistical modelling of the climate data, they also predicted Multan, Pakistan (origin site of the Multan and Burewala epidemics) as the site of emergence of newer CABs and their subsequent spread across the borders 3 . Taking these predictions into consideration, it appears that the reappearance of the previous species/strains in Vehari, Pakistan 1 and the detection of alike sequences with the recent CLCuD outbreak in Punjab, India hallmark the 'third epidemic', as predicted by Sattar et al. 3 . However, in contrast to the predictions by Sattar et al., non-detection of CLCuKoV-Bu and/or other bipartite virus in the present study or in the study by Zubair et al. 1 , we hypothesize that the re-emerging CLCuMuV complex or its descendants will be accountable for impending epidemics in the Indian subcontinent. Our assumptions   Table 3. Statistical features of recombination events observed in complete DNA-β sequence analyzed in the present study. *For KY081414.
also concur with the findings of Brown et al. 8 , demonstrating that certain core CABs, including CLCuMuV and CLCuKoV-Bu are endemic to the Indian subcontinent and are expected to prevail in this region. The study also established that as compared to others, certain CAB complexes undergo rapid genetic diversification, which may lead to expansion of their host range, transmission potentiality, virulence and resistance breaking capability 8 . From this perspective, high genetic diversity of CLCuMuV and satellite complexes as revealed in the present study seem to justify their evolution and re-emergence with novel recombinations. Changing epidemiology of CLCuD in Pakistan side of the Indian subcontinent has been well documented in the recent literature 1,3 . In India too, indications were evident from the results of recent field trials of certain Bt cotton varieties conducted at various locations of northern and northwestern India 16,26,30 . During these trials, a number of cotton varieties that were resistant against CLCuD during crop season 2011-2012 suddenly showed susceptible or highly susceptible reaction during the subsequent crop seasons 26 . Notably, around 90% of cotton crops presently being cultivated in the Indian subcontinent are Bt cotton hybrids, which were primarily introduced to prevent the menace of bollworm complex 3,19,20,25,50,51 . However, a large number of these Bt cotton varieties are either susceptible to CLCuD or their tolerance limits have not been thoroughly scrutinized 25,26,51 . During the present outbreak, it was observed that Bt hybrid cotton varieties (G. hirsutum) were destroyed specifically, while indigenous cotton species (G. arboreum) remained unaffected, which has been observed during previous outbreaks too 19,21,25,26,50,51 . It has been hypothesized that G. arboreum being native to the Indian sub-continent, has developed immunity against the Asian virus complexes due to its long co-evolutionary history, while the varieties of the exotic G. hirsutum, which were introduced into the Indian subcontinent from Mexico have not developed resistance and are still susceptible to Asian CLCuVs 10,14,52 .
Interestingly, recent studies demonstrate that Bt hybrid cotton varieties are non-toxic to hemipteran pests (including whiteflies) and may perhaps augment their populations 53,54 . Additionally, indiscriminate use of pesticides over the years for pest control has induced exceptionally high magnitudes of pesticide resistance among the whitefly populations in the Indian subcontinent, which plays an important role in rapid dispersion of the virus complex causing outbreaks 2,3,25,55 . Nevertheless, widespread adoption of Bt cotton has dramatically changed the pest complex scenario in this cotton ecosystem, signifying a drastic decline in bollworms, while encouraging the emergence of several sap feeding Hemipterans, including whiteflies, aphids, thrips etc. as serious pests, in the recent years 56 . Therefore, immediate re-evaluation of culture and control practices is required for management of Bt-cotton, where threat of CABs is significantly greater as compared to bollworms.
India and Pakistan rank third and fourth respectively from the perspective of global cotton production (FAOStat, 2013) 57 . Economies of both these countries greatly depend on cotton production. Due to the geographical immediacy of India and Pakistan, viral species and recombinants evolving in one of these regions spread rapidly to the other 1,3,6,9,10 . Although CAB complexes circulating in Pakistan are well characterized, there is a scarcity of scientific literature on the genetic diversity of CAB complexes in the Indian side. To the best of our knowledge, this is the first study from India to report in-depth genetic characterization of the CAB complex directly isolated from outbreak stricken fields. Overall, results of this study correlate with recent literature and field studies, signifying a changing epidemiology of CLCuD in this region.
In conclusion, our results strongly emphasize the necessity of stringent and regular monitoring of the prevailing CAB complexes in this region. This will definitely help in early detection of emerging or re-emerging etiological viral agents, thereby averting future outbreaks. Apart from the viruses, it is also equally important to rigorously screen the cotton varieties, especially the Bt cotton varieties, targeted for cultivation in a given region, against CABs prevalent in that region. Varieties found to be resistant in one geographical region should not be implied in other regions and rigorous field trials under natural conditions must be performed before their release for mass cultivation. In addition, resistant nature of the released varieties needs to be validated more often to identify any sudden change in susceptibility and to correlate such events with the molecular epidemiology of the CAB complexes. In parallel, cultivation of the indigenous cotton varieties should be encouraged, since these varieties have repeatedly been seen to be highly resistant to CLCuD in this region. Finally, there is an urgent need to revisit, reformulate and enforce sustainable agriculture trade practices, to prevent outbreaks of this socio-economically devastating disease in future.
We wish to dedicate this research to the everlasting memory of our colleague and co-author of this work, Raghvendra Budhauliya, whose untimely departure is an irrevocable loss to us all.