Variation in infection length and superinfection enhance selection efficiency in the human malaria parasite

The capacity for adaptation is central to the evolutionary success of the human malaria parasite Plasmodium falciparum. Malaria epidemiology is characterized by the circulation of multiple, genetically diverse parasite clones, frequent superinfection, and highly variable infection lengths, a large number of which are chronic and asymptomatic. The impact of these characteristics on the evolution of the parasite is largely unknown, however, hampering our understanding of the impact of interventions and the emergence of drug resistance. In particular, standard population genetic frameworks do not accommodate variation in infection length or superinfection. Here, we develop a population genetic model of malaria including these variations, and show that these aspects of malaria infection dynamics enhance both the probability and speed of fixation for beneficial alleles in complex and non-intuitive ways. We find that populations containing a mixture of short- and long-lived infections promote selection efficiency. Interestingly, this increase in selection efficiency occurs even when only a small fraction of the infections are chronic, suggesting that selection can occur efficiently in areas of low transmission intensity, providing a hypothesis for the repeated emergence of drug resistance in the low transmission setting of Southeast Asia.

infectious to mosquitoes than single genotype infections 9,16. It has also been hypothesized that multiple parasites may be co-transmitted via mosquitoes 17,18. Indeed, many theoretical frameworks assume that parasite genotypes are sufficiently antigenically divergent that they circulate essentially independently of each other 19-22. Frequent superinfection provides opportunities for competition among parasite lineages, whether via cross-reactive immune responses or in competition for red blood cells, both within-host and during transmission. This additional layer of complexity is not easily accommodated in standard theoretical and evolutionary frameworks such as the Wright-Fisher model 23,24.
The impact of these characteristics of malaria infections on the evolutionary capacity of the parasite is unclear, despite the central role that adaptation plays in its epidemiological success. We have previously shown that the within-host parasite population expansion and repeated bottlenecks imposed by the malaria parasite life cycle during acute infections have a profound impact on the population genetics of the parasite 25,26, including reducing the probability of fixation of beneficial mutations and the power to detect positive selection. However, genomic analyses from field isolates were still able to detect genes under positive selection 27-31, and drug resistance has emerged multiple times 32-36. Moreover, the majority of drug resistant parasites have emerged in Southeast Asia 32,34-36, an area of low transmission with an assumed low effective population size, contrary to the predictions of classic population genetic models.
We hypothesize that these paradoxical findings might arise because, in addition to exponential expansion and bottlenecks, chronicity and superinfection create additional differences from the Wright-Fisher model. In particular, chronic infections allow for multiple transmission events over several weeks or months, violating the assumption of "non-overlapping generations" in typical population genetic models 23,24, and different mixtures of short and long infections change the epidemiological dynamics that underlie adaptation. Here, we develop the first stochastic population genetic model that incorporates characteristic features of the malaria parasite lifecycle, including repeated within-host population expansion and between-host bottlenecks, superinfection and variation in infection length, and within-host competition. We show that the probability of fixation of beneficial alleles is higher in chronic infections, and that fixation is more likely, and occurs more rapidly, when superinfection is included. We then analyze a mixed population composed of both acute and chronic infections, which is likely to represent most malaria-endemic settings, and show that having variable infection dynamics increases the probability of fixation of beneficial alleles; unexpectedly, this increase is further enhanced when the proportion of chronic infections is low. Our results have important implications for our understanding of the evolutionary dynamics of the malaria parasite, and of other pathogens that cause both acute and chronic infections.

Results
Chronic infections exhibit higher probability of fixation than acute infections. To characterize the effect of long-lived infections on malaria parasite evolution, we first compare a model with only acute infections to a model with only chronic infections (see Methods). We applied idealized infection models (see Fig. 1A), which reflect trends recorded during experimental infections of humans and observations from field settings 8,9,37. In order to compare between models we keep the number of infected human and mosquito hosts constant, which is equivalent to assuming that the prevalence of infection (a common measure of transmission intensity) is at equilibrium. That is, the average number of newly infected hosts in each iteration is equal to the average number of infected hosts who die or clear infections, i.e. an effective reproductive number of unity ($R_e = 1$). We compare the probability that a beneficial mutation occurring on the same day post-infection in the two models becomes fixed in the population of hosts, as well as the time to reach this fixation. We examine selection of alleles that are beneficial within-host ($s_h^{+}t_m^{0}$) and during the transmission between hosts ($s_h^{0}t_m^{+}$) separately, and when they are beneficial both within-host and between-hosts ($s_h^{+}t_m^{+}$) or are beneficial at one of these levels but deleterious at the other (trade-off models, $s_h^{-}t_m^{+}$ or $s_h^{+}t_m^{-}$). Selection coefficients during infection within the host versus between hosts are denoted, respectively, by $s_h$ and $t_m$. We vary $s_h$ between 0.1, 0, and −0.01, corresponding to $s_h^{+}$, $s_h^{0}$, and $s_h^{-}$, respectively (Table S1). Chronicity promotes the fixation of beneficial mutations occurring during the exponential growth phase of infection, and the probability of fixation increases with the length of infection, although the time to fixation also increases with infection length (Figs S1A and S2A), consistent with the reduced incidence in populations with longer infections.
The selection models incorporating within-host advantage but not between-host disadvantage ($s_h^{+}t_m^{0}$ and $s_h^{+}t_m^{+}$) show this pattern, but the models with only between-host advantage ($s_h^{0}t_m^{+}$ and $s_h^{-}t_m^{+}$) do not. In selection models incorporating within-host advantage but not between-host disadvantage ($s_h^{+}t_m^{0}$ and $s_h^{+}t_m^{+}$), because the frequency of beneficial alleles at the end of infection is greater in the chronic-infection model (Table S2), the probability of passing beneficial alleles to the next host(s) during the transmission bottleneck is greater, and the probability of fixation is greater. In the selection model with only between-host advantage ($s_h^{0}t_m^{+}$), mutation frequencies within the host fluctuate neutrally during infection, and the probability of fixation is not influenced by the length of infection. In the trade-off model with between-host advantage but within-host disadvantage ($s_h^{-}t_m^{+}$), the probability of fixation decreases with the length of infection because the frequency of the mutation decreases over time within the host. In the trade-off model with within-host advantage but between-host disadvantage ($s_h^{+}t_m^{-}$), the probability of fixation is always smaller than $10^{-6}$ (Fig. S2A). The probability of fixation remains so low because even if a mutation becomes fixed in an individual host, there is selection against its onward transmission.

Superinfection greatly increases selection efficiency. Superinfection dramatically increases the probability of fixation for all selection models except the trade-off between-host model ($s_h^{-}t_m^{+}$) (Fig. S2B). Including superinfection in addition to chronicity in the malaria model leads to a probability of fixation greater than predicted from the Wright-Fisher model 25, supporting the observations of frequent adaptive evolution in malaria parasites. Here, we assume that superinfection could occur at any point during infection, and originate from any other infectious host. Superinfection does not impact the acute model, although it is included, because parasites from secondary infections are not mature by the time of transmission in acute infections.
Superinfection leads to an increased acquisition of infections per host as well as an increased number of outgoing transmissions, and promotes competition between alleles, increasing the probability of fixation in all selection models except the trade-off between-host model ($s_h^{-}t_m^{+}$). In that model, superinfection increases the competition between alleles within the host, and, because the mutation is deleterious within the host, reduces the probability of fixation.
Interestingly, the speed of fixation shows a complex relationship with infection length and mode of selection. The time to fixation decreases with the length of infection for the models of within-host advantage (Fig. S2B). In this case, despite the reduced clearance rate, a beneficial allele can be transmitted to both new infections and to existing infections, without "waiting" for clearance of an infection with only the wild-type allele. Because superinfection brings wild-type and mutant alleles together frequently and facilitates direct competition between them, the time to fixation decreases dramatically with superinfection. In the models with between-host advantage but lacking within-host advantage ($s_h^{0}t_m^{+}$ and $s_h^{-}t_m^{+}$), the time to fixation increases with the length of infection (Fig. S2B). Since these models have no selective advantage within the host, or even a disadvantage, the frequency of beneficial alleles fluctuates stochastically or decreases within the host. Even if a mutation reaches fixation within the host by chance, superinfection can still subsequently introduce wild-type alleles, lowering the speed of fixation.
Infection length variation in the host population enhances selection efficiency. The epidemiology of most malaria-endemic regions is characterized by infections of varying lengths. Although the relative proportion of short- and long-lived infections in different transmission settings is poorly understood, it is generally assumed that the fraction of chronic infections will increase with transmission intensity as increasing fractions of the population become semi-immune. We therefore extend our model to investigate the probabilities of fixation in populations with the same number of infected human hosts (that is, the same prevalence) but different proportions of chronic infections. We initially assume equal infectiousness of acute and chronic patients (the default value of the relative infectiousness of acute to chronic infections, B, is 1) to reflect the uncertainty about the relationship between parasite density and infectiousness to mosquitoes 13,15,38. In the mixed model, we assume that new infections originate from either chronic or acute infections, and we vary the fraction of chronic infections in the population.
In this scenario, two opposing forces shape the probability of fixation (Fig. 2): an increasing fraction of infections that are long-lived enhances allelic competition within the host, but it also changes the dynamics of transmission to reduce the contribution of chronically infected individuals to the infectious reservoir (Fig. S3). As a result, we find that the probability of fixation actually decreases with the fraction of chronic infections, regardless of the inclusion of superinfection in the model (Fig. 3). Here, in the populations where long-lived infections constitute the majority of the infectious reservoir, although within-host competition between alleles is very high, the number of newly infected hosts per unit time is relatively low because most people are already infected (Fig. S3). When the proportion of chronic infections is low, on the other hand, the parasite population benefits both from the increased probability that a beneficial mutation will compete successfully and be selected and transmitted at multiple time points during a long infection, and also from the frequent availability of new susceptible hosts due to cleared acute infections. With superinfection, the contribution of chronically infected individuals to the infectious reservoir decreases less with the proportion of chronic infections, and we observed a smaller change in the probability of fixation with the proportion of chronic infections (Fig. 3B). Assuming that low transmission intensity is associated with a substantial fraction of acute infections due to low population-level immunity 39 , our results imply that we expect the highest probability of fixation of beneficial alleles in regions with low transmission intensity.
This association is not sensitive to the total number of infected human hosts overall or the incidence of infection (Fig. S4), although the model of between-host advantage ($s_h^{0}t_m^{+}$) is impacted by stochastic fluctuations in allele frequency within the host. We compared two cases with the same level of incidence (i.e., similar numbers of newly infected hosts in each iteration), and the probability of fixation is consistently smaller in the model with a larger proportion of chronic infections (Table S3). Given that acute infections may be characterized by high parasite densities and potentially higher infectiousness, we increased the relative infectiousness of acute to chronic infections (B). Even when acute infections are twice as infectious as chronic infections, the relationship between the probability of fixation and the proportion of chronic infections remains the same (Fig. S5). Note that if acute infections are orders of magnitude more infectious, we expect the balance of forces shown in Fig. 2 to shift.
Time to fixation showed a relatively complex dependence on the type of selection occurring and the inclusion of superinfection (Fig. 3). Although it was not significantly impacted by whether the beneficial mutation occurred in acute or chronic infections, time to fixation tends to increase with the proportion of chronic infections (Fig. 3). Within-host advantage and superinfection both acted to accelerate the speed of fixation. Within-host advantage increases the frequency of beneficial alleles within the host, and superinfection enhances the number of transmissions per unit time as well as competition between beneficial and wild-type alleles. Again, the mixed model with superinfection but without within-host advantage ($s_h^{0}t_m^{+}$ and $s_h^{-}t_m^{+}$) shows the slowest speed of fixation because the mutation fluctuates neutrally or decreases within the host and superinfection acts to slow down the within-host fixation process by bringing wild-type alleles into the host.

Discussion
We have shown using a population genetic model that several aspects of the lifecycle of Plasmodium falciparum, namely chronicity and superinfection, combine to enhance selection efficiency for beneficial mutations, particularly for mutations conferring advantage within the host. Mutations occurring in the chronic-infection only model offering within-host advantage have higher probability of fixation than those occurring in the acute-infection only model (Fig. 4A). Superinfection further increases selection efficiency of mutations occurring in chronic patients, making both probability and speed of fixation higher in the chronic-infection model (Fig. 4B). Superinfection also dramatically increases the probability of fixation in the model with acute and chronic infections (Fig. 4C). Interestingly, selection is most efficient in the model with both acute and chronic infections, and the association between the probability of fixation and the proportion of chronic infections is negative (Fig. 3).
Our results imply that ignoring superinfection and overlapping generations due to the variation in duration of infection strongly biases our understanding and prediction of adaptation of malaria parasites. For instance, previous models ignoring superinfection and chronicity predicted a lower probability of fixation for beneficial mutations and lower ability to detect positive selection compared to the Wright-Fisher model 25,26 , while empirical genomic analysis detected signals of positive selection. Including both chronicity and superinfection in our model leads to a much greater probability of fixation, reconciling the discrepancy between model predictions and observations.
These results have important implications for the emergence of drug resistance mutations. Resistance to chloroquine, sulphadoxine-pyrimethamine, mefloquine, and artemisinin all emerged in Southeast Asia, a relatively low transmission setting, where we would expect a lower parasite effective population size than in Africa 32,34-36,40. It has been suggested that the prevalence of counterfeit drugs and frequent use of anti-malarials in Southeast Asia, as well as their earlier introduction, may have led to repeated drug resistance emergence in this region 33,41,42. There have been several theoretical studies on the evolution of drug resistance, focusing on different aspects of the complexity of malaria infections, including chronicity and superinfection 43-50. Our study is the first stochastic population genetic model that considers the details of the malaria lifecycle (repeated within-host population expansion and between-host transmission bottlenecks), variation in infection length, superinfection, and within-host competition at the same time. Both the probability and speed of fixation per resistant mutation and the number of new or existing drug resistant mutations are key components of the evolution of drug resistance. We focus on the probability and speed of fixation of a beneficial mutation that may or may not have a trade-off, and our results suggest that these may actually be higher in low transmission settings, if we assume a positive correlation between the proportion of chronic infections and transmission intensity. Here, beneficial mutations have the advantage of rapid spread due to the rapid turnover of acute infections (Fig. S3), and of within-host selection in chronic patients if the mutation is beneficial within the host.
It is expected that parasites causing acute infections are likely to be under the strongest selection, since they may produce severe disease requiring treatment, but even for beneficial mutations occurring in acute patients, the combination of chronic infections and superinfection dramatically increases the speed of fixation of mutations offering within-host advantage. Furthermore, the proportion of infections that are treated in low transmission settings is also likely to be higher than that in high transmission settings 51 because low exposure to malaria infections is thought to lead to low immunity and a higher proportion of infections with clinical symptoms, further increasing selection pressure on the parasite population. Our results therefore provide a mechanistic hypothesis for why the emergence of drug resistance has occurred repeatedly in low transmission settings.

Methods
We use a stochastic population genetic framework to model within-host allele frequencies in human and mosquito hosts including transmission events between hosts (Fig. S6). Each host contains a population of parasites, and selection can happen within the host, during transmission, or both. Only infected hosts and mosquitoes are included in the model, and resolution of one infection on average results in the appearance of a new infection. The parameters and their baseline values are summarized in Table S1.
Patient model. We assume that during an acute infection, the parasite population expands exponentially to the order of $10^{11}$ (details below) and either the patient dies or the parasite population is cleared on day 20 (Fig. 1). For chronic infections, the parasite population initially expands as in acute infections, but then declines precipitously to $10^{6}$ on day 20 and is controlled by the immune system, persisting at this low level for another 20 to 180 days 3,52,53. In the mixed model with both acute and chronic infections, for simplicity, we set the length of chronic infection to 200 days.
Simulation. Consistent with the erythrocytic cycle of Plasmodium falciparum in the blood, we used 48 hours as the time unit for one iteration. We assume the number of infected human hosts is N = 1000. To reflect an endemic population where individuals are at different stages of infection, we start the simulation assuming an [...] (Fig. S3). Initially, all parasites have wild-type alleles. During each iteration, the parasite population size in each host increases, decreases, or stays the same, depending on the day since infection. Similar to previous work 26, five selection models are used: within-host advantage ($s_h^{+}t_m^{0}$), between-host advantage ($s_h^{0}t_m^{+}$), both within- and between-host advantage ($s_h^{+}t_m^{+}$), and trade-off ($s_h^{-}t_m^{+}$ or $s_h^{+}t_m^{-}$). Selection coefficients within the host and during transmission are denoted by $s_h$ and $t_m$, respectively. Results were obtained from at least $10^{6}$ repeat stochastic simulations or 10,000 fixations, whichever was reached first.
Human infections. During infection between days 0 and 18, the parasite population size increases by an average of $16P(1 + s_h)$ fold every other day, i.e. each parasite produces $X$ parasites, where $X$ is Poisson distributed with mean $16P(1 + s_h)$, $P = 0.9$ is the probability of survival for each parasite 54,55, and $s_h$ is the selection coefficient within the human host. On infection day 20, the parasite population size is on the order of $10^{11}$ (assuming $s_h = 0$, $10 \times (16 \times 0.9)^{9} = 2.66 \times 10^{11}$) 56 and is composed of mature parasites that can be transmitted to mosquitoes. During transmission events, a bottleneck occurs in which only $D = 10$ parasites are transmitted to the mosquito 56.
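The growth arithmetic above can be checked with a short deterministic sketch. It tracks only the expected parasite count rather than the Poisson-distributed offspring numbers, and it assumes that the nine growth steps are the nine two-day intervals between day 0 and day 18, with the day-18 size carried to day 20. The function name and its arguments are illustrative and not taken from the original model code.

```python
def expected_parasites(day, chronic, s_h=0.0, founders=10, p_survive=0.9):
    """Expected parasite count on an even day post-infection (sketch only)."""
    if day < 0 or day % 2:
        raise ValueError("day must be a non-negative even number")
    if day <= 20:
        # Assumption: nine growth steps in total, completed by day 18.
        steps = min(day // 2, 9)
        fold = 16 * p_survive * (1 + s_h)
        return founders * fold ** steps
    # Assumption: after day 20 an acute infection is gone and a chronic
    # infection is held at 10^6 parasites.
    return 1e6 if chronic else 0.0


if __name__ == "__main__":
    # Neutral allele: roughly 2.66e11 parasites at maturity.
    print(f"{expected_parasites(20, chronic=False):.3g}")
```

Passing a positive `s_h` shows how quickly a within-host advantage compounds over nine generations.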
In order to keep the host population size stable, the number of transmissions is Poisson distributed with the mean equal to $dA/r_t$, where $d$ represents the expected number of human hosts whose parasites are cleared ($d = N(1 - X)/G_1 + NX/G_2$, where $X$ is the proportion of chronic infections, and $G_1$ and $G_2$ are the durations of infection for acute and chronic infections, respectively), $r_t$ is the number of hosts that can transmit parasites, and $A = 10$ is the ratio of mosquito to human hosts 25. This implies that the mean number of infections in previously uninfected hosts generated by each infectious host per unit of time decreases with the number of hosts that can transmit parasites. If there is transmission selection ($t_m \neq 0$), the number of transmissions is Poisson distributed with the mean $(1 + t_m) dAC_1/r_t$ or $dAC_1/r_t$, depending on whether the hosts contain parasites with the mutation or not. $C_1$ is the normalization constant that keeps the host population size stable ($C_1 = r_t / \sum_{i=1}^{r_t} (1 + t_{mi})$, where $t_{mi}$ represents the selection coefficient during transmission from human $i$ to mosquito). On day 20, the parasite is cleared in acute infections and the parasite population size decreases to $10^{6}$ in chronic infections 57.
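As a rough illustration of the transmission bookkeeping described above, the sketch below computes the expected number of clearances $d$ and the per-host Poisson means. It assumes that $G_1$ and $G_2$ are measured in iterations, and that $C_1$ is chosen so that the per-host means sum to $dA$ whatever the mix of mutant and wild-type hosts. Both are assumptions made here for illustration, and the function names are hypothetical.

```python
def expected_clearances(n_hosts, frac_chronic, len_acute, len_chronic):
    """d = N(1 - X)/G1 + NX/G2, with durations assumed to be in iterations."""
    return (n_hosts * (1 - frac_chronic) / len_acute
            + n_hosts * frac_chronic / len_chronic)


def transmission_means(d, t_m_per_host, mosquito_ratio=10):
    """Poisson mean of outgoing transmissions for each infectious host.

    t_m_per_host holds t_m for hosts carrying the mutation and 0 otherwise.
    """
    r_t = len(t_m_per_host)
    # Assumed normalisation: total expected transmissions stay at d * A.
    c1 = r_t / sum(1 + t for t in t_m_per_host)
    return [(1 + t) * d * mosquito_ratio * c1 / r_t for t in t_m_per_host]


if __name__ == "__main__":
    d = expected_clearances(1000, 0.2, len_acute=10, len_chronic=100)
    means = transmission_means(d, [0.1, 0.0, 0.0, 0.0])
    print(d, sum(means))
```

With this normalisation a host carrying the mutation transmits $(1 + t_m)$ times as often as a wild-type host, while the total stays fixed, so the number of infected hosts remains stable on average.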
In chronic infections, from day 22 onwards, the parasite population size stays at $10^{6}$ and evolves by sampling with replacement from the parasites in the previous iteration, as in the Wright-Fisher model 23,24. Parasites with or without the mutation have the probability of $(1 + s_h)C_2$ or $C_2$ to reproduce, where $C_2$ is the normalizing constant that keeps the parasite population size in the human hosts stable when there is host selection ($C_2 = 1 / \sum_{j} (1 + s_{hj})$, where $s_{hj}$ represents the selection coefficient of parasite $j$ within the human host). Parasites can be transmitted to mosquitoes during chronic infections at any iteration starting on day 20.
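One iteration of the chronic phase can be sketched as a Wright-Fisher draw with selection. The sketch assumes that the normalizing constant makes the reproduction probabilities sum to one, so each offspring picks a mutant parent with probability $m(1 + s_h) / (m(1 + s_h) + n - m)$ for $m$ mutants among $n$ parasites. The function names are illustrative, and a small $n$ is used because drawing $10^{6}$ parasites one by one in pure Python is slow.

```python
import random


def mutant_parent_probability(mutants, size, s_h):
    """Chance that a single offspring descends from a mutant parasite."""
    weight_mutant = mutants * (1 + s_h)
    weight_wild = size - mutants
    return weight_mutant / (weight_mutant + weight_wild)


def wright_fisher_step(mutants, size, s_h, rng):
    """Resample `size` parasites with replacement; return the new mutant count."""
    p = mutant_parent_probability(mutants, size, s_h)
    return sum(1 for _ in range(size) if rng.random() < p)


if __name__ == "__main__":
    rng = random.Random(1)
    count = 50
    for _ in range(20):
        count = wright_fisher_step(count, 1000, 0.1, rng)
    print(count)
```

Because the population size is held constant, selection here changes only allele frequencies, which is the behaviour the text describes for the chronic phase.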

Mosquito infections.
In the mosquito, during infection between days 0 and 10, the parasite population replicates 12 times, on average by $2P$ fold each time, to reach approximately $10 \times (2 \times 0.9)^{12} = 11568$ parasites at the end of the time in the mosquito 55,56,58. The incubation period in the mosquito is assumed to be 10 days 59,60, at which point parasites can be transmitted to the human host. The bottleneck size during this transmission event is $D = 10$ 55,56,61. Here, the probability of transmission is the ratio of human to mosquito hosts ($1/A$). Because malaria parasites spend more of their life cycle in the human host, we choose to focus in this study on mutations that are advantageous within the human host or during the transmission from human to mosquito host, and assume that the mutation does not have an effect in the mosquito or during the transmission from the mosquito to the human host.
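The two numerical ingredients of this step, expected growth in the mosquito and the $D = 10$ bottleneck, can be illustrated as follows. The bottleneck calculation assumes that the transmitted parasites are drawn independently from a very large within-host population, so that the chance of passing on at least one mutant at within-host frequency $f$ is $1 - (1 - f)^{D}$. This is a simplifying assumption made here, not a formula quoted from the model, and the function names are hypothetical.

```python
def expected_mosquito_parasites(founders=10, p_survive=0.9, replications=12):
    """Expected parasite count after the replication rounds in the mosquito."""
    return founders * (2 * p_survive) ** replications


def prob_mutant_transmitted(freq, bottleneck=10):
    """Chance that at least one of the transmitted parasites is a mutant.

    Assumes independent draws from a very large within-host population.
    """
    return 1 - (1 - freq) ** bottleneck


if __name__ == "__main__":
    print(int(expected_mosquito_parasites()))
    for freq in (0.01, 0.1, 0.5):
        print(freq, round(prob_mutant_transmitted(freq), 3))
```

Under this assumption, an allele that has risen to a higher within-host frequency by the time of transmission is much more likely to pass through the bottleneck.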

Mutation and selection.
Our model is a one-locus model. As in the commonly used infinite-sites model in population genetics 62, we assume that mutation at this locus occurs only once. Mutation can take place in any iteration within patients. When comparing acute and chronic models, mutations occurring on the same day post-infection are used. In the within-host selection model, mutant parasites have a $(1 + s_h)$ fold higher probability of reproducing within human hosts in each asexual generation than wild-type parasites. In the between-host selection model, hosts with mutant parasites have a $(1 + t_m)$ fold higher probability of transmission than hosts with only wild-type parasites. In the model with both within-host and between-host selection, the same mutation has effects on both the probability of reproducing within human hosts and the probability of transmission. We choose to compare between models the probability of fixation and the time to fixation of a beneficial mutation occurring at day 0 in the human host because parasite numbers are smaller in the initial days post-infection, so a new mutation starts at a higher within-host frequency and has a higher probability of fixation (Table S2), and fewer simulation runs are needed to estimate the probability of fixation.

Superinfection.
We assume that the number of infections a host acquires is proportional to the duration of its infection. Superinfection can happen at any iteration, but a later infection can only contribute to outgoing transmissions if it has become mature (reached day 20) by the time of transmission. Therefore, superinfection does not have an effect in the acute model, as only the initial infection has time to mature. In a patient with superinfection, when the later infection reaches day 20, the allele frequencies of the existing infection and the later infection are mixed such that the contribution of the later infection to the within-host allele frequency is one-tenth of the contribution of the existing infection. For the case with more than two infections, the existing infection is composed of all previous infections.
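The mixing rule for superinfection can be written as a weighted average. The sketch below reads "one-tenth of the contribution" as relative weights of 1 for the existing infection and 0.1 for the later one, normalised to sum to one. That reading is an assumption, as are the function names.

```python
from functools import reduce

LATER_WEIGHT = 0.1  # later infection counts one-tenth as much as the existing one


def mix_two(existing_freq, later_freq, later_weight=LATER_WEIGHT):
    """Within-host mutant frequency after a later infection matures."""
    return (existing_freq + later_weight * later_freq) / (1 + later_weight)


def mix_many(freqs):
    """Fold infections in order of arrival; earlier ones form the 'existing' pool."""
    return reduce(mix_two, freqs)


if __name__ == "__main__":
    # A wild-type host superinfected by a fully mutant infection.
    print(round(mix_two(0.0, 1.0), 4))
    # A third, wild-type infection then dilutes the mutant slightly.
    print(round(mix_many([0.0, 1.0, 0.0]), 4))
```

Folding the infections in sequence mirrors the statement that, with more than two infections, the existing infection is composed of all previous infections.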