Analysis of strain distribution, migratory potential, and invasion history of fall armyworm populations in northern Sub-Saharan Africa

Fall armyworm (Spodoptera frugiperda J.E. Smith) is a noctuid moth pest endemic throughout the Western Hemisphere that has recently become widespread in sub-Saharan Africa. There is a strong expectation of significant damage to African maize crop yield and a high likelihood of further dispersal, putting the rest of the Eastern Hemisphere at risk. Specimens from multiple locations in six countries spanning the northern portion of the infested region were analyzed for genetic markers. The similarity of haplotypes between the African collections was consistent with a common origin, but significant differences in the relative frequency of the haplotypes indicated limitations in migration. The mitochondrial marker frequently used to identify two host strains appears to be compromised, making uncertain previous reports that both strains are present in Africa. This more extensive study confirmed initial indications based on Togo populations that Florida and the Greater Antilles are the likely source of at least a subset of the African infestation and further suggest an entry point in western Africa. The origin of a second subgroup is less clear as it was rarely found in the collections and has a haplotype that has not yet been observed in the Western Hemisphere.

predicted fall armyworm migration patterns as validated by genetic haplotype studies 9 . However, because climate, farming practices, and wind patterns are substantially different from North America, fall armyworm migratory behavior in Africa is uncertain, which is a considerable limitation to the development of regional and area-wide strategies for sustainable pest management.
Western Hemisphere fall armyworm consists of two subpopulations that are morphologically indistinguishable but differ in their host plant distribution and certain physiological features [10][11][12] . The rice-strain (R-strain) is most consistently found in millet and grass species associated with pasture habitats while the corn-strain (C-strain) prefers maize and sorghum [13][14][15] . The strains can only be identified using molecular markers whose association with specific host plants is generally consistent but not absolute 13,16,17 . It is not known whether this variability reflects inaccuracy of the markers or plasticity in strain behavior. Contributing to the uncertainty is evidence that the strains are capable of cross-hybridization in the field, though at a significantly reduced frequency relative to within strain matings 18,19 . The behavior of these hybrids is not well understood, but they are preferentially found in maize habitats and so appear to retain a C-strain host bias 20 .
The C-strain can be subdivided into two geographically divided subgroups (FL-type and TX-type) on the basis of differences in the frequency of mitochondrial haplotypes 21,22 . The TX-type profile is found in most of the Western Hemisphere, with the FL-type limited to Florida and the Caribbean, as well as populations that migrate from Florida to annually infest the eastern coast of the U.S. 1,[23][24][25] . The C-strain fall armyworm collected from the African country of Togo had a haplotype pattern most consistent with the FL-type 19 . Analogous haplotype differences for the R-strain have not yet been found, so it is unclear whether this group shows a similar geographical segregation.
In this paper we expand on the previous Togo study to include specimens from multiple locations in the island nations of São Tomé and Príncipe (Sao Tome) that lies approximately 300 km from the Africa mainland, Burundi, Democratic Republic of Congo (D.R. Congo), Tanzania, and Kenya (Table 1). Genetic analysis was performed to confirm the fall armyworm species identification, determine host strain identity, and identify polymorphisms that can distinguish between geographically separated African populations. Comparisons between the haplotypes found in the different African locations and with those prevalent in the Western Hemisphere were used to extrapolate likely migratory source locations and the magnitude of fall armyworm population movements. The implications of these results on the origins of the fall armyworm introduction into Africa, the behavior and presence of the host strains, and migration patterns are discussed.

Results
Fall armyworm identified by morphological criteria were analyzed by sequence analysis of segments from the presumptive coding region of the mitochondrial Cytochrome Oxidase Subunit I (COI) gene and the sex-linked Triosephosphate isomerase (Tpi) gene. All sequence variations observed were single base substitutions that did not alter the predicted amino acid sequence (Fig. 1). Little COI variation in Africa. The CO1A segment was used to confirm species identification by comparisons with barcode sequences in GenBank. Of the 188 African specimens analyzed for CO1A, three displayed barcodes inconsistent with fall armyworm (Fig. 2). Two from Togo (TogAL12.5 and TogAL13.2) shared a haplotype most similar to Sesamia calamistis Hampson (Lepidoptera: Noctuidae) and one from Kenya (KenAL65.1) clustered with Busseola fusca Fuller (Lepidoptera: Noctuidae). Both species are native to Africa and are pests of maize and sorghum. The remaining specimens contained two CO1A haplotypes identical to that found in a previous study of Togo fall armyworms 26 . One sequence was identical to a variant of the C-strain group (COI-CS01) and the other to a R-strain haplotype (COI-RS09) that are both common in the Western Hemisphere 27 . The COI strain grouping determined by CO1A was confirmed by analysis of the downstream segment COIB. Again, only two haplotypes were observed, one for each strain. These differed at the same 12 sites that differentiate the consensus COI-CS and COI-RS haplotypes derived from Western Hemisphere populations (  (Fig. 3A,B). In comparison, the 70 C-strain fall armyworm specimens characterized from Africa were all of the G 1164 G 1287 haplotype, consistent with the results of 43 specimens previously examined from Togo 26 (Fig. 3C).
Disagreement between COI and Tpi markers. All specimens from Africa were collected from maize or sorghum, host plants associated with the C-strain population. Yet surprisingly the majority of specimens (72%, 180/250) carried the R-strain COI-RS haplotype rather than the expected COI-CS majority observed in the Western Hemisphere collections ( Table 2). A subset of these was additionally tested for the Tpi strain markers, which gave a very different result. The C-strain Tpi-C haplotype as defined by C 183 was found in 86% (203/243) of the African specimens with only three displaying the R-strain Tpi-R marker (Table 3A). These frequencies were comparable to that observed in the previous study of Togo fall armyworm and similar to that found in Western Hemisphere collections from analogous C-strain host plants (Table 3A).

Geographical differences in the distribution of COI and Tpi haplotypes. The sex-linked Tpi
gene segregates independently from the maternally-inherited mitochondrial COI gene. Despite this, Western Hemisphere fall armyworm populations consistently displayed high levels of concordance between the markers, i.e., marker configurations of CO1-CS Tpi-C and COI-RS Tpi-R, which suggests strong biases for mating within rather than between strains (Fig. 4A). In pooled collections from C-strain host plants in the Americas, 51% were CO1-CS Tpi-C and 26% COI-RS Tpi-R, giving a concordance frequency of 77%. This Western Hemisphere profile showed a significant positive correlation with that found for the two western Africa nations of Togo and Sao The distribution of the COI-CS haplotype showed substantial differences between collections that also appeared to be aligned along an east-west axis (Fig. 5A). The COI-CS group was the majority in collections from the two most western African nations tested (Togo, Sao Tome) and progressively declined eastward, with the two most eastern locations (Kenya and Tanzania) showing a frequency of less than 10%. The mean frequency for Togo and Sao Tome was significantly different from that of the more eastern collections.
Two additional sites in the Tpi exon were previously shown to be highly variable in the Western Hemisphere but with much reduced strain specificity 19 . Sites gTpi192Y and gTpi198Y were similarly polymorphic in the Africa collection, subdividing the Tpi-C group into two haplotypes designated Tpi-Ca1 (with C 192 C 198 ) and Tpi-Ca2    (Table 2B). All three of these haplotype groups were also found in the Western Hemisphere at similar proportions. Once again there is evidence of east-west differences in frequency, with populations from Togo and Sao Tome showing the highest Tpi-Ca2 frequencies that were significantly different from the other collections (Fig. 5B).

The African Tpi-R haplotype appears unique. Comparisons between Tpi-Ca1 and Tpi-Ca2 with the
African Tpi-R variant (designated Tpi-Ra1) identified five differing sites in addition to the strain diagnostic gTpi183Y. Two sites, gTpi165Y and gTpi168Y, exhibit strain-specific variation in Western Hemisphere populations with >80% of Tpi-R specimens showing a different nucleotide than found in Tpi-C (Table 3BC). In contrast, only gTpi168Y differed between the African Tpi-C and Tpi-R variants (Table 3BC). In addition, Tpi-Ra1 differed from both Tpi-Ca1 and Tpi-Ca2 at three sites (gTpi129Y, gTpi144R, and gTpi180S) that are not strain-specific and are infrequently variable in the Western Hemisphere collections. To date the African Tpi-Ra1 haplotype has not been found in a survey of 699 specimens from the Western Hemisphere (Table 3C). The DNA sequence chromatographs for the Tpi-h group showed overlapping peaks at each of the sites that differed between the two African strain haplotypes (Table 3D). This pattern is consistent with heterozygosity between Tpi-Ra1 and either Tpi-Ca1 or Tpi-Ca2, indicating that the two African host strains as defined by Tpi can productively hybridize.

Discussion
In this and a previous study 26 , approximately 300 specimens from six African nations have been genetically characterized for markers in the COI and Tpi genes. The results show very little genetic variability in the African fall armyworm population with only two COI and three Tpi haplotypes found. Both COI variants and the two Tpi-Ca haplotypes were observed in all Africa locations and are common forms in the Western Hemisphere. In contrast, the African Tpi-R variant, Tpi-Ra1, is so far unique to Africa (Table 3). Only four Tpi-Ra1 specimens were found, two from Togo (both COI-RS), one from D. R. Congo (COI-CS), and one from Kenya (COI-RS), while the putative Tpi-Ra1/Tpi-CSa heterozygotes (the Tpi-h category) were present in all locations. Therefore, while the Tpi-Ra1 haplotype is rare in the collections from maize and sorghum, it is widely disseminated. These results in conjunction with the first detection of African fall armyworm infestations in 2016-2017 26 are consistent with a recent introduction of a small invasive C-strain population that migrated rapidly throughout the northern Sub-Saharan region of the African continent. The history of the Tpi-Ra1 subgroup is less clear, as too few have been collected to test for regional differences and its presence in the Western Hemisphere has yet to be demonstrated.
While the fall armyworm in the surveyed portion of Africa shared a common source, geographical differences in haplotype frequencies suggest that long distance movements between distant locations were not of sufficient magnitude or consistency to homogenize geographically distant populations. Geographical structure is indicated by three haplotype distributions, each consistent with dispersion along an east-west axis (Figs 4 and 5). If these differences persist over time it would indicate that annual migrations in Africa tend to be more regional than continental, with the described marker combinations providing a means to define migratory limits and the area sharing a common population.
Current data are too preliminary to make definitive conclusions about either the Western Hemisphere source(s) of the invasive population or their site(s) of introduction into Africa. Multiyear data are needed as well as surveys from southern Africa. Nevertheless, the results provide some preliminary indications. The distribution of concordant and discordant COI and Tpi marker combinations vary considerably across Africa, with the frequency profile of the Western Hemisphere positively correlating with that of the most western collection sites (Togo and Sao Tome) but not to the other more eastern locations (Fig. 4). This finding is most consistent with fall armyworm from the Western Hemisphere first entering through western Africa.
There is also evidence that identify the likely Western Hemisphere source of the Africa fall armyworm. All 108 of the CO1-CS fall armyworm from Africa analyzed at COIB have the G 1164 G 1287 SNP configuration. This is the predominant CO1-CS variant found in the Greater Antilles and Florida, with the latter being the source of migratory infestations in the eastern United States, but is rare in South America and relatively infrequent in Texas and the central regions of the United States. Therefore, we believe that the most likely source of the surveyed COI-CS fall armyworm in at least northern Sub-Saharan Africa are populations that normally winter in the Caribbean or Florida. This was first indicated by our earlier genetic survey of Togo 26 , with the additional collections confirming that finding and suggestive of a single introduction for the northern Sub-Saharan region. If multiple introductions occurred, it appears they all originated from the same Western Hemisphere source. The origin of the Tpi-Ra1 haplotype is not clear and its apparent scarcity in the Western Hemisphere is surprising. The few Tpi-Ra1 specimens found carry the same two CO1 haplotypes as the African CO1-CS group, suggesting that both originated from a common source. Therefore, we believe the most parsimonious explanation is a single introduction of both strains into Africa with a subset happening to carry a rare Tpi haplotype variant. However, we cannot completely discount the possibility that the Tpi-Ra1 subgroup may not be present in the Western Hemisphere and so has a different origin, perhaps even representing a fall armyworm population native to Africa. Furthermore, it is not clear whether the Tpi-Ra1 group is simply a variant of the African C-strain population (i.e., the gTpi183Y site is not strain-specific in Africa) or is representative of a more abundant R-strain that is present in habitats not yet surveyed. Characterization of additional Tpi-R specimens from Africa and surveys of different habitats should clarify this issue.
This and other studies have shown that the COI-RS haplotype is present throughout Africa and is the predominant group in some locations 2,3,26 . However, two observations indicate that COI-RS is not behaving as an accurate marker of the R-strain. The first is its high frequency in collections from traditional C-strain host plants (maize, sorghum), particularly in the central and eastern collection sites (Fig. 5A). While such events have been reported in the Western Hemisphere, they are not the norm 14,17,28 . The second is the predominance of the discordant configuration in parts of Africa, indicating dissociation between the COI and Tpi strain haplotypes (Fig. 4C,D). The reason for this loss of COI strain specificity can only be speculated, but could result from interstrain hybridization producing discordant configurations that by chance became predominant. Overall, we believe the data to date leaves uncertain whether the R-strain is present in Africa, particularly since the collections were limited to host plants preferred by the C-strain and so may not be representative of the total African population.
In conclusion, genetic characterization of portions of the COI and Tpi genes demonstrate a common origin for the recent infestation of fall armyworm over the northern portion of Sub-Saharan Africa and provide preliminary indications of where in Africa fall armyworm was first introduced and the likely Western Hemisphere source population. Haplotype distribution profiles identify differences between geographical populations that have the potential for defining fall armyworm migration patterns. This will have important ramifications for the design of area-wide strategies to mitigate infestations.

Methods
Specimen Collections and DNA preparation. Specimens were obtained as larvae from corn (maize) or sorghum plants at various locations in northern Sub-Saharan Africa at various times in 2016 (Table 1). Specimens were stored either air-dried or in ethanol at room temperature. A portion of each specimen was excised and homogenized in a 5-ml Dounce homogenizer (Thermo Fisher Scientific, Waltham, MA, USA) in 800 µl Genomic Lysis buffer (Zymo Research, Orange, CA, USA) and incubated at 55 °C for 5-30 min. Debris was removed by centrifugation at 10,000 rpm for 5 min. The supernatant was transferred to a Zymo-Spin III column (Zymo Research, Orange, CA, USA) and processed according to manufacturer's instructions. The DNA preparation was increased to a final volume of 100 µl with distilled water. Genomic DNA preparations of fall armyworm samples from previous studies were stored at −20 °C.
Genetic markers and nomenclature. The genetic markers are all single nucleotide substitutions. Sites in the COI gene are designated by an "m" (mitochondria) while Tpi sites are designated "g" (genomic). This is followed by the gene name, number of base pairs from the predicted translational start site (COI) or 5′ start of exon (Tpi), and the nucleotides observed using IUPAC convention (R: A or G, Y: C or T, W: A or T, K: G or T, S: C or G, D: A or G or T). Sites gTpi165Y, gTpi168Y, and gTpi183Y were previously described as 352, 355, and 370 20,29 . Characterization of the CO1 and Tpi gene segments. The COI markers are from the maternally inherited mitochondrial genome. Two adjacent segments of CO1 were analyzed by DNA sequencing. The COIA segment was amplified by the COI primers JM76 and JM77 and used to confirm species identity and to determine the fall armyworm COI strain haplotypes COI-CS (C-strain) and CO1-RS (R-strain) 30 . The DNA sequences of the fall armyworm host strains and other Spodoptera species were previously described and available in GenBank 27 . DNA alignments and consensus building were performed using MUSCLE (multiple sequence comparison by log-expectation), a public domain multiple alignment software incorporated into the Geneious Pro 10.1.2 program (Biomatters, New Zealand, http://www.geneious.com) 31 . Phylogenetic trees were graphically displayed in a neighbor-joining (NJ) tree analysis also included in the Geneious Pro 10.1.2 program 32 . The adjacent COIB segment was amplified by CO1 primers 891 F and 1472 R and used to confirm host strain identity and determine the region-specific haplotypes 24 (Fig. 1).
Variants in the Triosephosphate isomerase gene (Tpi) can also be used to identify host strain identity with results generally comparable with the CO1 marker [30,31]. The gTpi183Y site is on the fourth exon of the predicted Tpi coding region and was PCR amplified using the Tpi primers 412 F and 850 R (Fig. 1). The C-strain allele (Tpi-C) is indicated by a C 183 and the R-strain (Tpi-R) by T 183 20 . The Tpi gene is located on the Z sex chromosome that is present in one copy in females and two copies in males. Since males can be heterozygous for Tpi, there is the potential for the simultaneous display of both alternative nucleotides at Tpi 183 (denoted as Tpi-h), which would be indicated by an overlapping C and T DNA sequence chromatograph 20 .
For fragment isolations, 6 µl of 6X gel loading buffer was added to each amplification reaction and the entire sample run on a 1.8% agarose horizontal gel containing GelRed (Biotium, Hayward, CA) in 0.5X Tris-borate buffer (TBE, 45 mM Tris base, 45 mM boric acid, 1 mM EDTA pH 8.0). Fragments were visualized on a long-wave UV light box and manually cut out from the gel. Fragment isolation was performed using Zymo-Spin I columns (Zymo Research, Orange, CA) according to manufacturer's instructions. The University of Florida Interdisciplinary Center for Biotechnology (Gainesville, FL) and Genewiz (South Plainfield, NJ) performed the DNA sequencing. Data Availability. All data generated or analysed during this study are included in this published article.