Introduction

Mozambique is among the ten countries with the highest burden of malaria worldwide, with an estimated 10.2 million cases in 20211. Malaria transmission is very heterogeneous in the country, with a high burden in the north and very low transmission in the south, therefore requiring different strategies for effective control and potential elimination2. Early treatment of malaria illness with artemisinin-based combination therapies (ACTs) and the use of antimalarial medicines for prophylaxis and prevention remain key to malaria control and, ultimately, malaria elimination. However, resistance to artemisinin3 and partner drugs4, as well as to sulfadoxine-pyrimethamine (SP) used for chemoprevention5, threatens the global effort to reduce the burden of malaria6.

Surveillance of antimalarial efficacy is key to mitigating and managing the risk of resistance to antimalarial drugs4. The identification of molecular markers of antimalarial resistance has led to genetic approaches that can complement therapeutic efficacy studies, which follow standardized protocols6,7, to confirm resistance, monitor trends and raise early warning signals6. In the case of artemisinin, partial resistance (delayed parasite clearance) has been linked to mutations in the pfkelch13 propeller region3,6. In the Greater Mekong Subregion, emergence of these mutations has been associated with mutations in P. falciparum apicoplast ribosomal protein 10 (pfarps10; PF3D7_1460900), ferredoxin (pffd; PF3D7_1318100), chloroquine resistance transporter (pfcrt; PF3D7_0709000), and multidrug resistance 2 (pfmdr2; PF3D7_1447900) genes8. Recently, the validated pfkelch13 mutation R561H has been detected in Rwanda9 and Tanzania10, whereas A675V and C469Y have been associated with prolonged parasite clearance half-lives in Uganda11.

The development of resistance to ACT partner drugs continues to pose a challenge in the treatment of malaria4. Increased resistance to piperaquine has been associated with amplification of a section of chromosome 14 involving the genes plasmepsin 2 and 312, as well as with single nucleotide polymorphisms in a putative exonuclease gene (pfexo, PF3D7_1362500) in parasite isolates from Cambodia12. Mutations in the multidrug resistance transporter 1 (pfmdr1) gene (N86Y, Y184F, and D1246Y) have been associated with altered susceptibility to multiple drugs4,6, including artesunate-amodiaquine and artemether-lumefantrine13, although they have not been fully validated as resistance markers. The K76T mutation at pfcrt, together with different sets of mutations at other codons (including C72S, M74I, N75E, A220S, Q271E, N326S, I356T, and R371I), has been linked to chloroquine resistance4,6,14. Finally, clinical treatment failure with SP has been linked to A437G and K540E mutations of dihydropteroate synthase (pfdhps) in combination with triple mutations (N51I + C59R + S108N) in dihydrofolate reductase (pfdhfr)15. Additional pfdhps mutations (S436A/C/F/H and A581G) have been suggested to increase the levels of SP resistance16.

Identifying mutations associated with drug resistance from samples collected on a routine basis can inform drug policies and ensure that interventions utilize appropriate drug regimens. Since replacing chloroquine with a combination of amodiaquine and SP for uncomplicated malaria treatment in 2003, the Mozambique national treatment guidelines underwent various revisions17. In 2006, ACT was formally introduced by adopting artesunate/SP as a first-line treatment for uncomplicated P. falciparum malaria. The most recent change occurred in 2009, when the country introduced artemether-lumefantrine as the official first-line treatment, with artesunate-amodiaquine as a backup in situations when artemether-lumefantrine is contraindicated. Intermittent preventive treatment in pregnancy (IPTp) with SP was first implemented in the country in 2006, and delivered free of charge to all pregnant women18. In 2014, the national guidelines were updated and implemented countrywide to adjust to the ≥3 SP-dose World Health Organization recommendation. In 2015, a national household survey reported an IPTp-SP country coverage of 51.4% for one dose, 34.2% for two doses, and 22.4% for ≥3 doses19. Currently, the country is piloting the use of seasonal (SP and amodiaquine) and perennial (SP) malaria chemoprophylaxis. Several studies have reported the prevalence of molecular markers of antimalarial resistance in Mozambique14,20,21,22,23, but there is no comprehensive analysis of their spatial and temporal distribution in the context of the overall parasite genetic structure. In this study, we used amplicon-based and whole genome sequencing, machine-learning approaches, and relatedness as well as diversity analysis of microhaplotypes flanking pfdhps to describe the spatial and temporal distribution of antimalarial drug resistance markers, the geographic structure of P. falciparum parasites, and the evolutionary history of pfdhps mutant alleles in samples collected in 2015 and 2018 across south, central and north Mozambique.

Results

Sample size and geographic distribution

Among the 2251 P. falciparum samples included in this study, sequencing produced at least one resistance-associated genotype (among 11 genetic markers targeted) in 1784 (79%) samples (455 from 2015 and 1329 from 2018; 308 from North, 440 from Central, and 1034 from South Mozambique; Fig. 1 and Supplementary Tables 1–3). Among these samples, 1522 were obtained from malaria clinical cases (therapeutic efficacy studies, health facility surveys, or reactive surveillance), 200 from community surveys (mass drug administration, cross-sectional surveys), and 62 from pregnant women at first antenatal care visits (Supplementary Table 1). Whole genome sequences were obtained from a total of 1452 (64%) samples which passed quality filters.

Fig. 1: Source of P. falciparum samples providing genetic data.

Tables indicate the number of samples included in the analysis per province and year for each of the three main regions of the country. Provincial borders are indicated with thick lines. The specific districts providing data for the study are colored. Made with QGIS.

Polymorphisms in pfkelch13 gene and artemisinin-resistance predisposing background

Among the 1429 P. falciparum samples successfully genotyped for pfkelch13, 1393 were fully wild-type and 36 (2.5%) presented a total of 32 non-synonymous mutations not associated with artemisinin tolerance (Table 1). A mutation in codon 537 (N537D) was observed in a sample from southern Mozambique (2018). Of the six amino acids making the artemisinin-resistance genetic background, only pfcrt N326Y showed any variation, with five isolates out of 1637 (0.3%) carrying a mixed genotype (Table 2). Similarly, no mutations were observed at codon 415 of pfexo associated with resistance to piperaquine (n = 1394). The plasmepsin2/3 breakpoint was detected in 2 (0.4%) out of 524 P. falciparum isolates (Table 2).

Table 1 Pfkelch13 mutations detected in P. falciparum isolates collected in 2015 and 2018 in seven provinces from Mozambique.
Table 2 Molecular markers of P. falciparum antimalarial resistance observed at frequencies below 5% in Mozambique.

Polymorphisms in pfcrt and pfmdr1

Mutations at codons 72 (n = 1655), 74 (n = 1657), 75 (n = 1658), and 76 (n = 1656) in pfcrt, and at codons 86 (n = 1605) and 1246 (n = 1519) in pfmdr1 were absent or below 5% (Table 2). In contrast, 59% (899/1536) of the samples tested carried mutations at codon 184 (534 pure mutants and 365 mixed genotypes; Supplementary Tables 4, 5). No statistically significant difference was observed in the carriage of this mutation between provinces or study periods (Supplementary Fig. 1 and Supplementary Tables 6–8).

Polymorphisms in pfdhfr and pfdhps genes

Mutations at codons 164 in pfdhfr, and 581 and 613 in pfdhps were either absent or below 1% (Table 2). Mixed genotypes were observed at frequencies of 1–2% for pfdhfr codons 51, 59, and 108, and 5–11% for pfdhps codons 437 and 540 (Supplementary Table 5). After excluding these mixed genotypes, the overall frequency of mutations was ≥97% in pfdhfr (97% in codon 51 [1596/1638], 98% in codon 59 [1597/1625] and 99% in codon 108 [1635/1649]) and ≥88% in pfdhps (90% in codon 437 [1289/1439] and 88% in codon 540 [1242/1404]; Supplementary Table 6 and Supplementary Fig. 2). The most prevalent pfdhfr and pfdhps alleles were the triple (S108N/N51I/C59R; 99% [1548/1600]) and double mutants (A437G/K540E; 89% [1228/1377]), respectively, and 87% (1155/1330) of the isolates were quintuple mutants (Supplementary Table 6). The overall frequency of quintuple mutants increased from 80% [234/293] in 2015 to 89% [921/1037] in 2018 (p < 0.001; Fig. 2a–c, Supplementary Table 7 and Supplementary Data 1), mainly in Cabo Delgado (from 40 to 72%, p < 0.001) and Gaza (from 90 to 100%, p < 0.001). Similar increases were observed for triple pfdhfr and double pfdhps mutants (p < 0.001). The frequency of quintuple mutants increased from north to south, both in 2015 (40% in Cabo Delgado vs 93% in Maputo; p < 0.001) and 2018 (72% in Cabo Delgado vs 95% in Maputo; p < 0.001), mainly driven by differences in pfdhps double mutants (Fig. 2a–c). The multivariable logistic regression analysis showed that both region (north, central and south) and period (2015 and 2018) were independently associated with the relative abundance of pfdhfr/dhps mutations, which increased from north to south and from 2015 to 2018 (Supplementary Table 8).
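The increase in quintuple mutants between 2015 and 2018 can be checked with a simple comparison of two proportions. The sketch below uses the counts reported above (234/293 and 921/1037); the choice of a pooled two-sided z-test and the function name `two_proportion_z_test` are assumptions made for illustration, since this section does not state which test produced the reported p-values.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions (pooled variance).

    Illustrative only: the test actually used in the study is not stated here.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Quintuple mutants among isolates without mixed genotypes, per study period.
p2015, p2018, z, p = two_proportion_z_test(234, 293, 921, 1037)
print(f"2015: {p2015:.1%}  2018: {p2018:.1%}  z = {z:.2f}  p = {p:.2g}")
```

Under these assumptions the p-value falls below 0.001, in line with the reported result; an exact or chi-squared test would be equally reasonable choices.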

Fig. 2: Molecular markers of P. falciparum sulfadoxine-pyrimethamine (SP) resistance in Mozambique.

Frequency of P. falciparum isolates carrying triple mutations in pfdhfr (a), double mutations in pfdhps (b), and quintuple mutations in pfdhfr/pfdhps (c) in 2015 and 2018 in seven provinces from Mozambique. For the pfdhps haplotype 436/437/540 (d), frequencies of the different allelic combinations are shown (n = 1365). Frequencies were calculated after excluding mixed genotypes. Data from Sofala was only available for 2015, and from Inhambane and Zambézia for 2018. The error bars represent a 95% confidence interval for the population proportion.
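A 95% confidence interval for a proportion, such as the error bars in Fig. 2, can be computed in several ways. The sketch below uses the Wilson score interval, which behaves well for frequencies close to 0 or 1; this choice, and the function name `wilson_interval`, are assumptions, as the caption does not say which interval method was used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


# Example with the overall 2018 quintuple mutant count reported in the text.
low, high = wilson_interval(921, 1037)
print(f"{921 / 1037:.1%} (95% CI {low:.1%} to {high:.1%})")
```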

The distribution of mutations at codon 436 (S436C/A/H/F) in pfdhps had a very marked geographic pattern (Fig. 2d and Supplementary Data 1). After excluding mixed genotypes, mutations at 436 were observed in 17% (40/232; C in 6, F in 5, H in 4, and A in 25) of the isolates obtained from Cabo Delgado, but only in 0.6% (8/1307) of the isolates from the rest of the country, and never in combination with a double 437/540 mutation background (Supplementary Table 9). Therefore, three different pfdhps haplotypes were observed in Cabo Delgado: triple wild-type (S436/A437/K540; 26/203 [13%]), mutant at codon 436 but wild-type in codons 437 and 540 (37/203 [18%]), and wild-type at codon 436 but mutant in codons 437 and 540 (S436/A437G/K540E; 139/203 [68%]). In contrast, the S436/A437G/K540E haplotype was predominant in the rest of the country (1089/1162 [94%]; Supplementary Table 9). No change in the frequency of mutations in codon 436 was observed between study periods (p = 0.371; Supplementary Table 8).

Population structure

A total of 8722 microhaplotype loci were reconstructed via local assembly from 1438 samples which produced whole genome sequences. Of these, 349 samples contained data for less than 50 percent of all microhaplotype loci (Supplementary Fig. 3a) and were therefore excluded. The median expected heterozygosity (He) of the 8722 microhaplotypes from the 1089 samples with data for more than 50% of the microhaplotypes was 0.312 (Interquartile range [IQR]: 0.196–0.498). Twenty-four percent of the microhaplotypes had high expected heterozygosity (He > 0.5) in the parasite population analyzed, and 366 had He >0.75 (Fig. 3a and Supplementary Data 1).
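Expected heterozygosity at a locus summarizes how likely two parasites drawn at random are to carry different alleles. A minimal sketch follows, assuming the commonly used unbiased estimator He = n/(n − 1) × (1 − Σ pᵢ²); the exact estimator applied in the study is not given in this section, and the function name is illustrative.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity for one locus.

    `alleles` is a list with one observed microhaplotype allele per parasite.
    Assumes He = n / (n - 1) * (1 - sum(p_i ** 2)).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two observations are needed")
    counts = Counter(alleles)
    homozygosity = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - homozygosity)


# A locus with one dominant allele is far less diverse than a balanced one.
print(expected_heterozygosity(["A"] * 9 + ["B"]))
print(expected_heterozygosity(["A", "B", "C", "D"]))
```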

Fig. 3: P. falciparum population structure by geography in Mozambique.

Microhaplotypes from regions of 150–300 bp in length between long tandem repeats were reconstructed from whole genome sequences and used to test the geographic structure of P. falciparum parasites. a Distribution of the expected heterozygosity at the 8722 microhaplotype loci extracted from whole genome sequences. The y-axis represents the number of microhaplotype loci for a given expected heterozygosity. The red line marks the 75th percentile of the distribution; the 25% most diverse loci were considered for population structure analysis. b Chromosomal locations of the 155 most important microhaplotypes, which contribute to the geographic (North-Central-South) classification model. c Principal coordinates analysis with samples grouped into regions (North-Central-South; n = 1089), considering microhaplotypes at loci with expected heterozygosity in the top 25%. d Principal coordinates analysis with samples grouped into regions considering the 155 top microhaplotypes, with an out-of-bag error rate of classification of 24.89%. e, f Complexity of infection (COI) for samples in different regions of Mozambique in 2015 (e) and 2018 (f), as indicated by the number of genetically distinct clones. Regional assignment of samples: North: C. Delgado; Central: Sofala, Tete, and Zambézia; South: Gaza, Inhambane, and Maputo.

The 25% most diverse microhaplotype loci (n = 2181) were evaluated as predictors for geographic classification using a random forest analysis at the province and regional levels. The model failed to classify samples at the province level (out-of-bag error rate = 50.51%). However, the out-of-bag error rate was 24.89% at the regional level (North-Central-South; Fig. 3c, d and Supplementary Table 10). The lowest out-of-bag error rate was observed when classifying samples from North and South (8%), with higher rates when central region samples were considered (15.26% for Central-South and 36.79% for North-Central; Supplementary Fig. 4). Removal of 155 microhaplotypes caused the model to lose accuracy in the regional classification, as judged by the inflexion point of the distribution of mean decrease in accuracy (Supplementary Fig. 3b); these microhaplotypes were therefore considered the most relevant. Thirty-one percent of these microhaplotypes were located in chromosome 6, with each of the remaining chromosomes contributing less than 10% (Fig. 3b and Supplementary Data 1).
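The first step of this analysis, keeping only the most diverse quarter of loci, can be sketched as a simple ranking by expected heterozygosity. The function below is a hypothetical helper, not the authors' code; it rounds the number of loci up, which reproduces the 2181 loci reported for 8722 microhaplotypes, although the exact thresholding rule and the handling of ties are not stated in the text.

```python
import math


def most_diverse_loci(he_by_locus, fraction=0.25):
    """Return the names of the top `fraction` of loci ranked by He.

    `he_by_locus` maps a locus name to its expected heterozygosity.
    The count is rounded up (an assumption), and ties are broken by sort order.
    """
    keep = math.ceil(fraction * len(he_by_locus))
    ranked = sorted(he_by_locus, key=he_by_locus.get, reverse=True)
    return ranked[:keep]


# Synthetic He values for 8722 loci, used only to illustrate the selection.
he_values = {f"locus_{i}": i / 8722 for i in range(8722)}
selected = most_diverse_loci(he_values)
print(len(selected))
```

The selected loci would then serve as predictor variables for the classifier, whose out-of-bag error is estimated from the samples left out of each bootstrap replicate.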

Overall within-host complexity of P. falciparum infections, calculated from the 100 microhaplotypes with the highest He, was 2 (IQR [1,2]) with a prevalence of 47% (517/1090) monogenomic infections. The complexity of infection (COI) and prevalence of monoclonal infections in 2015 was similar in the three regions from Mozambique (p = 0.801 and p = 0.507, respectively). However, median COI in 2018 differed between the three regions (p < 0.001), with the lowest values in the south (1, IQR [1,2]), followed by the center (2, IQR [1,2]) and north (2, IQR [1,3]). Similar trends were observed in the prevalence of monogenomic infections (51% in the south, 46% in the center, and 35% in the north; p = 0.005; Fig. 3e, f, Supplementary Tables 8, 11, and Supplementary Data 1).
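To make the COI and monogenomic prevalence summaries concrete, the sketch below uses a deliberately naive lower-bound estimate: the largest number of distinct alleles seen at any locus within a sample. This is an assumption for illustration only; the estimator used in the study is not described in this section and is likely more sophisticated, and both function names are hypothetical.

```python
from statistics import median


def naive_coi(alleles_by_locus):
    """Lower-bound COI: the maximum number of distinct alleles at any locus.

    `alleles_by_locus` maps a locus name to the alleles detected in one sample.
    """
    return max((len(set(alleles)) for alleles in alleles_by_locus.values()), default=0)


def summarize_coi(samples):
    """Return the median COI and the proportion of monogenomic (COI = 1) samples."""
    cois = [naive_coi(sample) for sample in samples]
    monogenomic = sum(1 for coi in cois if coi == 1) / len(cois)
    return median(cois), monogenomic


# Three toy samples: one monogenomic, two polygenomic.
toy_samples = [
    {"m1": ["A"], "m2": ["C"]},
    {"m1": ["A", "B"], "m2": ["C"]},
    {"m1": ["A", "B", "D"], "m2": ["C", "E"]},
]
print(summarize_coi(toy_samples))
```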

Microhaplotypes flanking pfdhps were used to infer the evolutionary history of the mutant alleles in Cabo Delgado only, because in the rest of the provinces double mutants had almost reached fixation (frequencies between 80 and 100%). Sixteen microhaplotypes were contained in a 50 kb region around the pfdhps gene, 15 of them in eight genes and one intergenic (Supplementary Fig. 5). These flanking microhaplotypes separated the parasites carrying the double pfdhps mutant haplotype (always accompanied by a wild-type 436 codon) from the rest of parasites (Fig. 4a, b and Supplementary Data 1). The 50 kb region flanking pfdhps was more similar among double pfdhps mutants (n = 92; median identity by state [IBS] = 0.88, IQR[0.81–0.91]) than among the double wild-type (n = 51, median IBS = 0.68, IQR[0.62–0.76]; p < 0.001; Supplementary Fig. 6). Similarly, He of microhaplotypes flanking pfdhps was 60% lower in the double mutants (median = 0.1, IQR[0.04–0.26]) than in wild-type alleles (median = 0.34, IQR[0.21–0.41]; p = 0.016; Fig. 4c, Supplementary Table 12, and Supplementary Data 1), consistent with recent selection.
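Pairwise identity by state across the flanking loci can be illustrated with a short sketch. Here two samples are counted as identical at a locus when they share at least one allele, and loci missing in either sample are skipped; both rules, and the function names, are assumptions made for illustration rather than the definition used by the authors.

```python
from itertools import combinations
from statistics import median


def pairwise_ibs(sample_a, sample_b):
    """Fraction of jointly typed loci at which two samples share an allele.

    Each sample maps a locus name to the set of alleles detected at it.
    Returns None when the two samples have no typed locus in common.
    """
    shared_loci = [
        locus for locus in sample_a
        if locus in sample_b and sample_a[locus] and sample_b[locus]
    ]
    if not shared_loci:
        return None
    matches = sum(1 for locus in shared_loci if sample_a[locus] & sample_b[locus])
    return matches / len(shared_loci)


def median_ibs(samples):
    """Median IBS over all pairs of samples within one group."""
    values = [pairwise_ibs(a, b) for a, b in combinations(samples, 2)]
    return median(value for value in values if value is not None)


s1 = {"m1": {"A"}, "m2": {"B"}}
s2 = {"m1": {"A"}, "m2": {"C"}}
s3 = {"m1": {"A"}, "m2": {"B"}}
print(pairwise_ibs(s1, s2), median_ibs([s1, s2, s3]))
```

A higher median IBS within one allele group than within another, as reported for the double mutants, indicates that the flanking region is more conserved around that allele.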

Fig. 4: Regional separation, relatedness, and expected heterozygosity of pfdhps allelic haplotypes.

Identity by state (IBS) and expected heterozygosity (He) were calculated using 16 microhaplotypes flanking pfdhps to assess the evolutionary history of pfdhps mutant alleles in Mozambique. a Heatmap of the inter-sub-population IBS matrix among dhps alleles in Cabo Delgado observed in 2015 and 2018 (wild-type in codons 436, 437, and 540 [WT/WT/WT]: n = 20; mutant in codon 436 but wild-type in codons 437 and 540 [MUT/WT/WT]: n = 31; wild-type in codon 436 but mutant in codons 437 and 540 [WT/MUT/MUT]: n = 92). Sixteen microhaplotypes in a 50 kb region around pfdhps were used to calculate the pairwise IBS between samples. b t-distributed stochastic neighbor embedding visualization after 10000 iterations. c Expected heterozygosity calculated from the 16 microhaplotype loci in a 50 kb region around pfdhps in parasites collected from Cabo Delgado. Median and interquartile range (IQR) He values: 0.1, IQR (0.04–0.26) for double mutants; 0.37, IQR (0.2–0.47) for WT/WT/WT; and 0.28, IQR (0.13–0.4) for MUT/WT/WT. The lower, middle, and upper hinges of the rectangle correspond to the 25% quantile, median, and 75% quantile of the distribution, respectively.

Discussion

This study provides a country-wide resolution of P. falciparum markers of antimalarial resistance and genetic structure in Mozambique which can be used to inform the use of antimalarials for treatment and chemoprevention as well as to study the impact of future interventions. The genomic data provides evidence that: (1) although non-synonymous mutations were observed in pfkelch13, none of them have been associated with artemisinin tolerance; (2) genetic variants associated with piperaquine and chloroquine resistance were rare in 2018; (3) in contrast, the frequency of pfdhfr/dhps mutations increased from north to south, almost reaching fixation in Maputo Province; and (4) this spatial trend was accompanied by a reduction towards the south in the genetic complexity of P. falciparum infections and a signal of geographic differentiation which allows a regional separation based on highly diverse microhaplotypes.

The pfkelch13 wild-type, artemisinin-sensitive haplotype predominated in the parasite population surveyed from Mozambique, and none of the validated artemisinin-resistance variants were detected6. However, an array of 32 rare non-synonymous mutations was identified, similar to results from other studies in Africa24. Among them, the N537D mutation, which has been reported as potentially associated with delayed clearance24, was detected in one sample, whereas A578S, the most predominant pfkelch13 mutation in P. falciparum parasites of African origin24 but not associated with artemisinin tolerance, was found in four of the 1429 genotyped samples. The background mutations found to anticipate the emergence of pfkelch13 mutations in South-East Asia8 were not detected in this study. Similarly, there is no strong evidence of polymorphisms associated with resistance to ACT partner drugs. Only 0.4% of the samples analyzed showed evidence of piperaquine resistance, based on the analysis of the breakpoint within the distal end of plasmepsin 325 associated with plasmepsin 2/3 duplications and the single nucleotide polymorphism at codon 415 of the putative exonuclease gene12. This is in agreement with a previous study in Mozambique which found multiple copies of plasmepsin 2 in 1.1% of the samples analyzed21. Mutations in codon 86 of pfmdr1, associated with resistance to amodiaquine and increased susceptibility to lumefantrine26,27, were detected in 11 of the 1605 samples analyzed. Fifty-nine percent of the parasites carried the 184F pfmdr1 variant, although this mutation appears to have a weaker association with antimalarial effectiveness in vivo28,29 and in vitro26. However, pfmdr1 markers must be considered with caution, due to inconsistent associations with ACT partner drug resistance30, indicating that robust molecular markers associated with amodiaquine and lumefantrine are still lacking.

The present study revealed an evolutionary process acting on the molecular markers of SP resistance. Overall, a high frequency of triple pfdhfr (99%), double pfdhps (89%), and quintuple mutant haplotypes (87%) was observed, which increased from 2015 to 2018 and from north to south. Microhaplotypes in the 50 kb region around pfdhps mutant alleles were more similar (higher IBS) and less diverse (lower expected heterozygosity) than around the wild-type allele, suggesting a recent expansion of the double mutant population in the country31. Geographical heterogeneities in the prevalence of pfdhfr/dhps alleles were accompanied by a different distribution of the pfdhps-436 mutation, which was only detected in the north of the country (Cabo Delgado) at a frequency of 17% and never in combination with the double 437 and 540 mutant haplotypes. Changes at codon 436 have been associated with higher levels of in vitro SP resistance32, although the evidence for in vivo resistance is less clear33. The increase in pfdhps mutations from north to south was also accompanied by a reduction in the number of genetically distinct parasite strains infecting an individual, indicative of declining malaria transmission intensity34. Finally, the pfdhps mutational pattern was also coincident with a regional separation of the parasite population based on highly diverse microhaplotypes, suggestive of geographical structuring. Geographical distance, barriers in gene flow across the regions, and differences in the coverage of antimalarial interventions35 due to unequal distribution of resources and security issues could have contributed to the microhaplotype regional differentiation, which might have affected the geographic patterns observed in the molecular markers of SP resistance. However, it is still unclear what selective forces have fueled the spread of pfdhfr/dhps mutants in the absence of large-scale SP use, as Mozambique abandoned SP for clinical management in 2009. Compensatory mechanisms that reduce the mutation fitness cost36, or an insufficient pool of sensitive parasites to fuel recovery37, may have contributed to the increase in pfdhfr/pfdhps mutants in the absence of SP drug pressure. However, these were not limiting factors for the recovery of chloroquine sensitivity in Mozambique, where the mutations in pfcrt were almost fixed38. The contribution to drug pressure of SP for IPTp or other drugs such as cotrimoxazole39,40 is likely to be small, as the populations targeted make up only a small part of the overall population of Mozambique at a given time. Finally, lower levels of antimalarial immunity and of sexual recombination to unlink resistance haplotypes may have also contributed to the higher carriage of molecular markers of SP resistance in southern Mozambique, where malaria transmission is the lowest41. Although these patterns seem consistent with directional selection due to drug pressure, the nature of the study does not allow other factors, such as the regional impact of neighboring countries’ drug policies42, to be ruled out.

The results of this study have several public health implications. First, all available in vivo efficacy data7,43,44 and the lack of validated kelch13 mutations in 2018 suggest the appropriate efficacy of artemisinin for P. falciparum treatment and reduction of malaria transmission in Mozambique. However, the broad array of rare non-synonymous mutations that were detected could potentially provide a deep reservoir of variations for the emergence of artemisinin tolerance45, as has been recently reported in Rwanda and Uganda9,11. Second, the lack of mutations at codon 415 of pfexo and the low prevalence of plasmepsin2/3 breakpoints detected (0.4%) suggest the appropriate performance of piperaquine as an ACT partner drug. However, plasmepsin2/3 gene amplification should be closely monitored, given the rapid emergence and spread of piperaquine resistance in Southeast Asia, resulting in high treatment failure rates after dihydroartemisinin-piperaquine treatment12. Third, the data support the use of SP for chemoprevention in spite of the high carriage of pfdhps and pfdhfr quintuple mutants, due to the lack of evidence of a relationship between molecular markers and chemopreventive efficacy5,46. Moreover, the pfdhps A581G mutation, which has been suggested to reduce the SP chemopreventive effectiveness in infants and pregnant women47,48,49, was detected only in three of the 1490 (0.2%) samples that were analyzed, therefore supporting the continued use of SP for IPTp in Mozambique. Similarly, amodiaquine effectiveness is likely to remain acceptably high for seasonal malaria chemoprevention because of the very low prevalence of 72–76 mutations in pfcrt and the 86Y–184Y haplotype of pfmdr1, suggested to be necessary for clinically relevant resistance to amodiaquine in Africa50,51. Fourth, the very low prevalence (0.6%) of markers of chloroquine resistance in pfcrt and the evidence of a return of its therapeutic efficacy in Mozambique38, together with its chemoprophylactic activity and safety profile, suggest that chloroquine could play a role, on its own or in combination with other drugs or tools, for chemoprevention at the population level or for currently unprotected populations such as first trimester pregnant women52. Fifth, microhaplotypes indicative of regional population structure may be useful for identifying broad-scale parasite flow or assignment of geographic origin in Mozambique. However, the lack of a finer population structure at the province level may impose some limitations on this approach, which might be better evaluated by taking advantage of differences in pairwise relatedness of parasites as opposed to regional differentiation in allele frequencies53,54. These highly diverse microhaplotypes may also be useful to develop parasite genetic diversity metrics for detecting changes in transmission intensity and monitoring the effectiveness of antimalarial interventions34. Sixth, this study also shows the utility of secondary analysis of blood samples from other studies to describe molecular patterns of surveillance interest. And finally, heterogeneity in molecular markers of antimalarial resistance within Mozambique highlights the need to use caution when extrapolating survey results from a single location.

The study has several limitations. First, the dried blood spots that were evaluated represent a convenience sample obtained from different studies, resulting in heterogeneities in the age, the clinical status of individuals, and intensities of transmission that could affect the level of immunity and intake of treatment. While further work is required to quantify the impact of these factors, these heterogeneities represent those that can be found in a country such as Mozambique, where malaria transmission ranges from very high in the north to very low in the south. Second, the exclusion of mixed wild-type-resistant infections to calculate the frequency of resistance haplotypes may bias resistance estimates55. Third, selective forces other than those driven by the use of antimalarials cannot be ruled out, as the signals of recent selection were inferred using microhaplotypes in the 50 kb region around pfdhps which overlap coding regions of other genes. Finally, caution is required when inferring the treatment and chemopreventive efficacy of an antimalarial, which depend on factors other than intrinsic parasite susceptibility, such as patient-acquired immunity, initial parasite biomass, treatment adherence, dosing, drug quality, and pharmacokinetics6. However, information on molecular markers plays an important role in tracking resistance and should be leveraged to detect early warning signals56. Combining chemoprevention efficacy studies57 with the monitoring of pfdhps mutations is required to evaluate SP-based chemopreventive strategies.

In conclusion, this report shows north-south genetic signals of increasing molecular markers of SP resistance, decreasing genetic complexity of infections, and geographic differentiation in Mozambique. However, the very low prevalence of 581 mutations in pfdhps supports the role of SP for chemoprevention in Mozambique. Similarly, no molecular signals of artemisinin tolerance were observed in Mozambique. These results provide baseline data for studying the evolution of P. falciparum parasites in response to changing national malaria treatment guidelines. Moreover, these findings prompt the integration of molecular surveillance systems with treatment and chemoprevention efficacy studies to track the emergence and expansion of drug resistance in Mozambique. To achieve this, addressing inefficiencies in sampling and sequencing efforts, together with financial support and appropriate use of the data generated, is required to ensure the sustainability of malaria molecular surveillance programs58.

Methods

Study site and sample collection

This study analyzed 2251 samples collected in 2015 (n = 724) and 2018 (n = 1527) from 40 districts in seven provinces of Mozambique (Supplementary Tables 1, 2): one in the north region (Cabo Delgado), three in the center region (Zambézia, Sofala, and Tete), and three in the south (Gaza, Inhambane, and Maputo; Fig. 1). Dried blood spots were collected from P. falciparum-infected individuals identified during six malaria observational studies and clinical trials conducted in 2015 and 20187,43,59,60,61. In 2018, two health facility survey studies recruited individuals of all ages attending outpatient services in Maputo, Zambézia, Cabo Delgado, Inhambane, and Gaza59,60. Samples from an additional two therapeutic efficacy studies included children less than 5 years of age with confirmed malaria (by rapid diagnostic tests) in Cabo Delgado, Tete, Sofala, and Gaza provinces in 201543 and in Cabo Delgado, Tete, Zambézia, and Inhambane in 20187. In the fifth study, individuals of all ages with a malaria-positive rapid diagnostic test were identified through community-based cross-sectional surveys in Maputo Province (2015 and 2018)61, including a malaria elimination project area61, which collected samples from individuals participating in mass drug administration campaigns and reactive surveillance in Magude District. Finally, in the sixth study, pregnant women at their first antenatal care visit with a P. falciparum infection confirmed by quantitative real-time PCR were identified through antenatal care surveys conducted in Maputo Province (2018)62. Health facility-based sampling sites were district or subdistrict health centers or provincial hospitals, selected by the Centro de Investigação em Saúde de Manhiça (CISM) or National Malaria Control Program according to their public health or research needs, whereas cross-sectional surveys were community-based and participants were randomly selected. Further information on sampling for each study is available in the associated publications. Before administering treatment, 50 μL dried blood spots on filter paper were obtained from each patient through finger-prick, identified with anonymous barcodes, and stored at 4 °C with silica gel.

Inclusion and ethics

Clinical-demographic data and blood samples were collected only after written informed consent and assent from all participants, or an accompanying adult, if younger than 18 years of age, was provided. All study protocols were approved by the Mozambican National Committee for Bioethics in Health. The research included local researchers throughout the research process, including the study design, study implementation, data ownership, intellectual property, and authorship of the publication. The research is locally relevant, as determined in collaboration with local partners, who agreed on the importance of malaria molecular surveillance. Roles and responsibilities were agreed amongst collaborators before implementing the research activities. Special emphasis has been allocated to capacity-building for local researchers on genomic and bioinformatic tools for molecular surveillance.

Amplicon-based sequencing

DNA was extracted from samples at the MalariaGEN Laboratory at the Wellcome Sanger Institute, Hinxton, UK, using high-throughput robotic equipment (Qiagen QIAsymphony)63. Parasite DNA was amplified by selective whole genome amplification, and genotyping was performed on the SpotMalaria platform63. Briefly, a first PCR generated 190–250 bp amplicons of interest in the parasite genome using locus-specific multiplexed primers, followed by a second PCR to incorporate unique sample-level and primer-pool multiplexing adapters. After sequencing multiple samples on a single MiSeq lane, sequences were demultiplexed using the unique multiplexing adapter IDs and aligned to a modified amplicon P. falciparum reference genome. Genotypes were called for each variant analyzed using bcftools and custom scripts63, namely: pfkelch13 (any mutation in codons 349–726, corresponding to the BTB/POZ and propeller domains)64, pfdhfr (codons 51, 59, 108, and 164)15, pfdhps (codons 436, 437, 540, 581, and 613)16, pfcrt (codons 72, 73, 74, 75, and 76)4,6,14, pfexo (codon 415)12, pfmdr1 (codons 86, 184, and 1246)13, and the artemisinin-resistance genetic background (codons 127 and 128 in pfarps10, 193 in pffd, 326 and 356 in pfcrt, and 484 in pfmdr2)8. The plasmepsin 2/3 duplication was detected with an assay targeting the breakpoint within the distal end of plasmepsin 3 that includes the complete duplication of the plasmepsin 2 gene (plasmepsin2/3 breakpoint), which identifies the hybrid sequence created by the duplication25.
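As a reading aid, the marker panel listed above can be written down as a small lookup table. This is a minimal Python sketch rather than the pipeline's actual configuration: the names `RESISTANCE_CODONS`, `BACKGROUND_CODONS`, `PFKELCH13_RANGE`, and `marker_class` are hypothetical, and only the gene names and codon positions are taken from the text.

```python
# Codons genotyped as drug-resistance markers (positions taken from the text).
RESISTANCE_CODONS = {
    "pfdhfr": {51, 59, 108, 164},
    "pfdhps": {436, 437, 540, 581, 613},
    "pfcrt": {72, 73, 74, 75, 76},
    "pfexo": {415},
    "pfmdr1": {86, 184, 1246},
}

# Codons genotyped as artemisinin-resistance genetic background.
BACKGROUND_CODONS = {
    "pfarps10": {127, 128},
    "pffd": {193},
    "pfcrt": {326, 356},
    "pfmdr2": {484},
}

# Any mutation in pfkelch13 codons 349-726 (inclusive) is of interest.
PFKELCH13_RANGE = range(349, 727)


def marker_class(gene, codon):
    """Return 'resistance', 'background', or None for a gene/codon pair."""
    if gene == "pfkelch13":
        return "resistance" if codon in PFKELCH13_RANGE else None
    if codon in RESISTANCE_CODONS.get(gene, set()):
        return "resistance"
    if codon in BACKGROUND_CODONS.get(gene, set()):
        return "background"
    return None
```

Note that pfcrt appears in both tables, because codons 72–76 are scored as chloroquine resistance markers while codons 326 and 356 are scored as genetic background.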

Whole genome sequencing

P. falciparum samples were also whole genome sequenced at the Wellcome Sanger Institute and the University of California, San Francisco. In brief, short sequence reads (200 bp) were generated on the Illumina HiSeqX platform at the Wellcome Sanger Institute65. At the University of California, San Francisco, barcoded libraries prepared using the NEBNext Ultra II DNA Library Prep Kit after selective whole genome amplification66 were pooled and sequenced on the Illumina NovaSeq 6000 System using 150 bp paired-end sequencing. Reads were filtered for a minimum per-base quality of 20. Variant calls were generated by running a custom pileup program and filtered to have a minimum read depth of 10 and a minimum within-sample frequency of 5%. These thresholds were derived from selective whole genome amplification control runs on known lab strains, so as to remove all false variant calls.
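The three thresholds described above (per-base quality of 20, read depth of 10, within-sample frequency of 5%) could be applied roughly as follows. This is an illustrative Python sketch, not the custom pileup program itself: the `AlleleCall` record and both function names are hypothetical, and it assumes that read depth is counted over all quality-filtered reads covering the site.

```python
from dataclasses import dataclass

MIN_BASE_QUALITY = 20  # minimum per-base Phred quality (from the text)
MIN_DEPTH = 10         # minimum read depth (from the text)
MIN_WSAF = 0.05        # minimum within-sample allele frequency (from the text)


@dataclass
class AlleleCall:
    """Hypothetical pileup summary for one allele at one site in one sample."""
    allele: str
    allele_reads: int  # quality-filtered reads supporting this allele
    total_reads: int   # quality-filtered reads covering the site (assumption)


def filter_base_qualities(bases, qualities, min_quality=MIN_BASE_QUALITY):
    """Keep only base calls whose Phred quality is at least min_quality."""
    return [b for b, q in zip(bases, qualities) if q >= min_quality]


def passes_filters(call, min_depth=MIN_DEPTH, min_wsaf=MIN_WSAF):
    """True if the site is deep enough and the allele is frequent enough."""
    if call.total_reads < min_depth:
        return False
    return call.allele_reads / call.total_reads >= min_wsaf
```

For example, an allele supported by 1 of 25 reads (4%) would be discarded, whereas one supported by 1 of 20 reads (5%) would be kept.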

Data analysis

The analysis aimed to describe the spatial and temporal distribution of antimalarial drug resistance markers, the geographic structure of P. falciparum parasites, and the evolutionary history of pfdhps mutant alleles. The frequency of infections carrying parasites with markers of antimalarial resistance was estimated at the province level, based on sampling location and year. For each codon, samples were classified as wild-type, mutant, mixed (if both wild-type and mutant alleles were detected), or missing (if the sample failed to produce a valid genotype). Samples with no mixed genotypes were retained for the haplotype reconstruction and downstream statistical analysis. A local haplotype reconstruction tool (Pathweaver67) was used to extract microhaplotypes from regions of 150–300 bp in length between long tandem repeats. Microhaplotypes were selected to contain no homopolymers or dinucleotide repeats longer than 10 bp, no length variation >3 bp, and at least two single nucleotide polymorphisms60. Samples with more than 50% of microhaplotype loci missing were excluded from subsequent analyses.
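The per-codon classification and the two sample-level filters described above can be sketched in Python as follows. The function names and input shapes are hypothetical, and one case not covered by the text is resolved by assumption: a sample carrying two different mutant alleles but no wild-type allele is labeled "mutant".

```python
def classify_genotype(observed_alleles, wild_type):
    """Classify one sample at one codon from the set of alleles detected.

    observed_alleles is empty when the sample gave no valid genotype.
    """
    alleles = set(observed_alleles)
    if not alleles:
        return "missing"
    if alleles == {wild_type}:
        return "wild-type"
    if wild_type in alleles:
        return "mixed"  # wild-type and at least one mutant allele detected
    return "mutant"     # assumption: several mutant alleles still count as mutant


def retain_for_haplotypes(codon_calls):
    """True if none of the sample's codon classifications is 'mixed'."""
    return "mixed" not in codon_calls.values()


def too_many_missing(locus_calls, max_missing_fraction=0.5):
    """True if more than max_missing_fraction of microhaplotype loci are None."""
    if not locus_calls:
        return True
    missing = sum(1 for call in locus_calls.values() if call is None)
    return missing / len(locus_calls) > max_missing_fraction
```

With these definitions, a sample missing exactly half of its microhaplotype loci is kept, because the exclusion rule applies only above 50%.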

Expected heterozygosity (He) at a locus was calculated using custom R code as \(\mathrm{H}_{\mathrm{e}}=\frac{n}{n-1}\left[1-\sum p^{2}\right]\) (equation 1), where n is the number of samples, and p is the allele frequency of each microhaplotype allele at the locus. The variance of He was calculated according to the formula \(\frac{2(n-1)}{n^{3}}\left\{2(n-2)\left[\sum p^{3}-\left(\sum p^{2}\right)^{2}\right]\right\}\) (equation 2). The within-host complexity of P. falciparum infections was calculated using a Markov chain Monte Carlo approach from the 100 microhaplotypes with the highest He (R package MOIRE, https://github.com/EPPIcenter/moire). Infections were considered monogenomic when the complexity of infection was one. The geographic structure was tested using Random Forest classification68 (R package randomForest, with ntree = 2500), with microhaplotypes with He in the top 25th percentile as predictors and the geographical location (province and region) as the outcome. Balanced training datasets (representing 75% of the data) were used for initial testing, and test datasets (the remaining 25% of data) were used to calculate the out-of-bag error rate of the classification model. An out-of-bag error rate lower than 25% was considered to indicate reasonably good classification. The classification was visualized by a Principal Coordinates Analysis of the proximity matrix. Microhaplotypes in the 50 kb region flanking pfdhps were used to infer the evolutionary histories of mutant alleles. Population structure was visualized using t-distributed stochastic neighbor embedding based on the presence/absence of microhaplotypes, computed with the R package Rtsne with 10,000 iterations.
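Equations 1 and 2 can be sketched in a few lines of code. The authors used custom R code that is not shown here; the following is a Python re-expression under the assumption that each sample contributes one microhaplotype allele at the locus, and the function names are illustrative only.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Equation 1: He = [n / (n - 1)] * [1 - sum(p^2)].

    alleles holds one microhaplotype allele per sample at a single locus.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two samples are needed")
    sum_p2 = sum((count / n) ** 2 for count in Counter(alleles).values())
    return (n / (n - 1)) * (1 - sum_p2)


def heterozygosity_variance(alleles):
    """Equation 2: 2(n-1)/n^3 * {2(n-2) * [sum(p^3) - (sum(p^2))^2]}."""
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two samples are needed")
    freqs = [count / n for count in Counter(alleles).values()]
    sum_p2 = sum(p ** 2 for p in freqs)
    sum_p3 = sum(p ** 3 for p in freqs)
    return (2 * (n - 1) / n ** 3) * (2 * (n - 2) * (sum_p3 - sum_p2 ** 2))
```

As a quick check, four samples carrying four distinct alleles give He = 1, and four samples sharing a single allele give He = 0.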
Between-sample relatedness was assessed through pairwise identity by state (IBS), calculated as \(\frac{1}{n}\sum_{i=1}^{n}\frac{S_{i}}{X_{i}Y_{i}}\) (equation 3), where n is the number of loci, Si is the number of microhaplotype alleles shared by the samples at locus i, and Xi and Yi are the number of microhaplotype alleles at locus i of samples X and Y, respectively. Chi-square tests and logistic regression models were used to compare frequencies of resistance markers between regions and study periods. Differences in He between pfdhps haplotypes at individual microhaplotype loci were tested through a permutation test that randomly shuffled the labels of the sub-populations 1000 times at each microhaplotype locus69. The Kruskal–Wallis rank sum test was used to compare the distributions of He and IBS between populations, with Dunn's test and Bonferroni correction for multiple testing in pairwise comparisons.
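Equation 3 can be illustrated with a short Python sketch. The function name and the input layout (a dictionary mapping each locus to the set of microhaplotype alleles detected in the sample) are hypothetical, and the sketch assumes that n counts only loci genotyped in both samples.

```python
def pairwise_ibs(sample_x, sample_y):
    """Equation 3: mean over loci of S_i / (X_i * Y_i).

    sample_x and sample_y map locus name -> set of microhaplotype alleles.
    Loci that are absent or empty in either sample are skipped (assumption).
    """
    loci = [
        locus
        for locus in sample_x
        if sample_x[locus] and sample_y.get(locus)
    ]
    if not loci:
        raise ValueError("no loci genotyped in both samples")
    total = 0.0
    for locus in loci:
        x, y = sample_x[locus], sample_y[locus]
        shared = len(x & y)  # S_i: alleles present in both samples
        total += shared / (len(x) * len(y))
    return total / len(loci)
```

For two samples that share their single allele at one locus, and where one sample carries two alleles and the other carries one of them at a second locus, the per-locus terms are 1 and 0.5, giving an IBS of 0.75.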

Statistics and reproducibility

This study analyzed 2251 samples collected from 40 districts in seven provinces of Mozambique. Among these, sequencing produced at least one resistance-associated genotype in 1784 samples, which were included in the statistical analysis. Whole genome sequences were obtained from a total of 1452 samples that passed quality filters. Statistical analyses were performed in Stata version 15.0 and R version 4.1.2. All reported p values are two-sided, and a p value of less than 0.05 was considered to indicate statistical significance.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.