Malaria is caused by unicellular eukaryotic Plasmodium parasites, and P. falciparum is responsible for most malaria morbidity and mortality. Despite significant progress over the past 20 years, resistance of mosquito vectors to pyrethroid1 and the emergence of multidrug-resistant parasite strains2 emphasize the need for new tools, including vaccines, to combat the disease. The most advanced malaria vaccine candidate to date is the RTS,S/AS01 vaccine, which has completed phase 3 clinical trials in young African children and is currently undergoing a large scale pilot introduction in Malawi, Ghana, and Kenya to inform on a policy decision for broader use3. The vaccine is based on PfCSP, which densely covers the surface of sporozoites and plays a critical role in the P. falciparum life cycle from the development of sporozoites in the mosquito midgut to liver-stage development in humans4,5,6,7. The N-terminal domain of CSP includes a heparan sulfate binding site for hepatocyte adhesion4, followed by the immunodominant central repeat region8, and the C-terminal α-thrombospondin repeat (αTSR) domain that contains multiple T-cell epitopes9. The repeat region in P. falciparum is composed of 1 NPDP, 3-5 NVDP, and 35-41 NANP repeats10,11,12,13. In contrast, RTS,S contains only 19 NANP repeats and the αTSR domain, linked to the hepatitis B surface antigen protein (HBsAg), and was expressed recombinantly with soluble HBsAg to form a virus-like particle that is administered with the AS01 adjuvant14. RTS,S displayed ~40 % efficacy against clinical malaria disease over 4 years of follow-up in phase 3 clinical testing; vaccine efficacy is highest in the period immediately following immunization and declines coincident with decay of induced Ab titers to CSP15,16,17,18. A similar vaccine candidate, R21, composed only of the same HBsAg-CSP fusion (i.e., without extra HBsAg), and formulated with Matrix-M adjuvant, has recently entered phase 2 clinical testing and is showing comparable efficacy levels in early clinical studies19. Another candidate is the attenuated, whole-sporozoite-based PfSPZ vaccine20, delivered by direct venous inoculation. It primarily aims to induce cellular immunity, and is thus associated with lower anti-CSP antibody titers compared to R21 and RTS,S; however, vaccine efficacy in endemic field studies has been modest21. These clinical studies highlight the need to improve current vaccines to induce either more durable protection and/or higher potency antibody responses.

Recently, many anti-CSP antibodies have been characterized using structural and biophysical approaches and various functional assays, which have contributed to our growing insights into humoral immune responses against CSP. The junctional region, which corresponds to the amino-acid sequence of PfCSP between the N-term domain and the NANP repeats, contains NPDP and NVDP motifs and has been shown as a target for potent antibody responses22,23. However, recent evidence suggests that anti-junction mAbs can also cross-react with the NANP repeats, and their protective capacity can be correlated with their binding promiscuity to NANP24,25. Epitopes of anti-NANP antibodies typically contain 2-3 NPNA structural motifs, which can adopt local conformations of a type I β-turn and an Asn pseudo 310 turn22,23,26,27,28,29,30. Anti-NANP antibodies often utilize a germline-encoded Trp residue to interact with the Pro or Asn in the NPNA turns29. Additionally, some of these Abs exhibit unusual homotypic inter-Fab contacts when they simultaneously recognize the adjacent repeating epitopes on the NANP region of CSP26,27. Despite these structural and biophysical advances, the implications for antibody and immune responses, especially which properties or structural features correlate with in vivo protection by anti-NANP antibodies, are still not resolved.

In this work, we characterize eight monoclonal antibodies (mAb) derived from protected volunteers who participated in a phase 2a clinical trial of RTS,S/AS0131 with a delayed fractional dosing regimen, and compare them with three previously published mAbs derived from the same clinical trial28,29. Our data suggest a correlation between antibody apparent affinity (driven by the off-rate) and in vivo protection, which could serve as an important basis for subsequent characterization and engineering of anti-NANP mAbs. Two antibodies exhibit homotypic Fab–Fab interactions, which increase avidity to the repeat peptides as found in previous studies, and may have implications for Ab responses against CSP as an unusual type of antigen26,27. Co-crystal structures also reveal conserved and convergent use of aromatic residues for interaction with the NANP repeat region. Furthermore, we observe that binding to NANP repeats with an extended conformation, which lacks any secondary structural motifs, may not be optimal for stable antibody interaction and could contribute to low affinity to CSP and, subsequently, poor protection. On the other hand, potent and high-affinity antibodies all recognize epitopes containing a type I β-turn and/or Asn pseudo 310 turn. These two structural motifs are observed as the repeating unit on a soluble recombinant shorter version of CSP (rsCSP) that can then adopt an unusual, long-range, extended spiral conformation in the presence of such antibodies27. Altogether, these comprehensive characterizations of anti-NANP antibodies enhance our understanding of human humoral immune responses against CSP and provide a strong foundation for the design of next-generation malaria vaccines.


Many anti-NANP antibodies show potent functional activity in vivo

The sequences of the antibodies in this study were originally derived from samples from protected volunteers in a phase 2a trial of RTS,S/AS01 seven days after the third fractional dose, as previously reported27,28,29. Antibodies were selected for study here from among the expanded sequence families with a focus on prevalent IGHV families. The eight monoclonal antibodies (mAbs) chosen for study were encoded by three different immunoglobulin heavy variable (IGHV) genes and recombinantly expressed. These mAbs were compared to previous mAbs 311, 317, and 397 from this set that included two other IGHV genes27,28,29. Experiments to evaluate in vivo protection were conducted for two panels of antibodies using two mouse models that assess parasite liver burden load and blood-stage parasitemia32,33. The first panel contained mAbs derived from the IGHV3-33 gene (mAb239, 337, 356, 364, and 395), and the second consisted of mAbs encoded by IGHV3-49 (mAbs 224, 399), IGHV1-2 (mAb366), IGHV3-15 (mAb397), and IGHV3-30 (mAb317) germline genes, with mAb311 (from IGHV3-33) as a control across both experiments. To assess reduction of parasite liver burden load, mice (N = 5) received 100 µg mAb by intravenous injection (IV), which is a dose established previously for standard comparison of anti-NANP antibodies32,33, and after 16 h, mice were challenged with chimeric sporozoites (P. berghei sporozoites expressing full-length P. falciparum CSP) (Fig. 1a). All antibodies significantly reduced parasite burden (except for mAb395 (p < 0.05); p < 0.01 for other mAbs, Mann-Whitney U test) with most mAbs inhibiting parasite development by at least 93% (Fig. 1a). However, in the first panel, mAb395 performed significantly worse and mAb337 showed less in vivo protection compared to the others (p < 0.01, Mann-Whitney U test) (Supplementary Table 1). In the second panel, mAbs 366 and 397 also displayed weaker protection, which was significantly less than mAbs 224 and 399 (p < 0.05, Mann-Whitney U test) (Supplementary Table 1).

Fig. 1: Assessment of antibody in vivo protection.
figure 1

a Parasite liver burden load was measured by bioluminescence of P. berghei sporozoites expressing luciferase-conjugated PfCSP after passive transfer of 100 µg of antibody in C57Bl/6 mice (N = 5). The numbers indicate the percent inhibition of the mean parasite burden relative to that of naïve control mice (i.e., % inhibition). A two-sided Mann-Whitney U test was used; *p = 0.0159, and **p = 0.0079. The bar denotes the geometric mean, and error bars denote the SD. Red dashed lines indicate the baseline signal from naïve non-infected mice treated with D-luciferin as established previously32. Parasite-free mice after passive immunization with b 300 µg or c 100 µg of the indicated antibodies before challenge with bites of infected mosquitoes. A log-rank test was used; in b *p = 0.032 (mAb395), **p = 0.0006 (mAb239, 356, and 364), 0.0005 (mAb311; second panel, and mAb317), 0.0013 (mAb399 and 224), 0.0022 (mAb311, first panel), 0.0053 (mAb337), and 0.0089 (mAb366 and 397); in (c) *p = 0.019 (mAb399 and 366), **p = 0.0051 (mAb239, 356, 364, 224, and 317) (N = 6 per group). Source data are provided as a Source Data file.

To further validate the liver burden results, mAbs (100 µg or 300 µg) were passively transferred by IV to mice (N = 6) 16 h before exposure to bites of infected mosquitoes as in previous studies33 (Fig. 1b, c). At 300 µg mAbs, all antibodies from both panels protected at least 50% of the mice from blood-stage infection compared to the naïve control group (p < 0.05, log-rank test). Nonetheless, all mAbs, except for those that confer sterile protection, have overlapping confidence intervals (Supplementary Fig. 1), indicating insufficient statistical power (N = 6) to distinguish between these mAbs (Fig. 1b). For 100 µg mAbs, only mAb239, 356 and 364 from the first panel, and mAb224, 399, 366, and 317 from the second-panel exhibit protection that was significantly greater than the naïve control mice (p < 0.05, log-rank test) (Fig. 1c).

Antibody apparent affinity increases with homotypic interactions

Antibody binding affinities were measured using isothermal titration calorimetry (ITC) against both short and long NANP repeat peptides to capture potential increases in apparent affinity due to avidity effects through homotypic Fab–Fab interactions, as observed previously for certain Fabs26,27. Despite conventional use of the term “NANP” repeats, our previous studies have consistently shown that anti-NANP antibody epitopes are typically composed of two to three “NPNA” structural motifs27,28,29. Therefore, the binding of all mAbs was measured against NPNA2, NPNA4, and NPNA6 peptides.

While some Fabs had strong binding (near the nM range) against short NPNA peptides, most Fabs tested here started from low affinity (µM range) that was dramatically increased against longer NPNA peptides, as indicated by the fold-change of the dissociation constants (Kd) (Table 1, Supplementary Figs. 2,3, and Supplementary Table 2). For example, mAb356 recognized NPNA2 with a µM Kd but bound NPNA6 in the nM range, resulting in an ~300-fold Kd change (Table 1). Overall, most mAbs bound strongly to the longer NPNA peptides, except for mAb395 and 366, whose affinity remained well within the µM range (Table 1). Multiple copies of Fab311, with an 8-fold mean Kd change between NPNA2 and NPNA6, simultaneously recognized rsCSP (NPDP/NVDP/NANP repeat ratio of 1/3/19 instead of 1/4/38 for the P. falciparum 3D7 strain) and exhibited homotypic inter-Fab contacts27. Therefore, mAbs 239, 337, 356, and 399 with higher fold-changes should also exhibit Fab–Fab interactions on binding multiple epitopes on the CSP NANP repeat region (Table 1).

Table 1 Dissociation constants of antibody Fabs and fold-changes in apparent affinity obtained from ITC.

The presence of homotypic Fab–Fab interactions in mAbs 239 and 399 was validated in the crystal structures of Fab239–NPNA4 and Fab399–NPNA6 complexes (Fig. 2, Supplementary Table 3). Two copies of Fab239 form side-to-side inter-Fab contacts, which are mediated mostly by the heavy chain (Fig. 2a–c). These Fab–Fab interactions are asymmetric where complementarity-determining region (CDR) H1 and H3 in one Fab interact mainly with CDR H2 and L3 in the adjacent Fab (Fig. 2b, c, g), which is strikingly similar to what is observed in the cryo-EM structure of Fab311 with rsCSP27. The interaction of HAsp31 (Asp residue 31 from the heavy chain) from CDR H1 with HGlu64 from CDR H2 is also conserved in the Fab–Fab interfaces of Fabs 239 and 311 (Fig. 2c). Therefore, these similarities suggest that Fab239 may also be capable of forming a long-range spiral structure, in which multiple copies of the Fab simultaneously bind to rsCSP27. Fab399 exhibits a different type of Fab–Fab interaction with a head-to-head configuration, resembling that of Fab121026 (Fig. 2d, e), but the inter-Fab contacts here are perfectly symmetric, more extensive, and mediated almost entirely by the heavy chain. HAsp31 in CDR H1 now forms a different network of hydrogen bonds and a salt bridge with HTyr32 and HArg94 in the adjacent Fab (Fig. 2f, g). Unlike Fabs 239 or 311, Fab399 is not likely to form a similar spiral conformation with rsCSP27 because such a structure would not accommodate the symmetric, head-to-head, inter-Fab interactions. Despite the different CDR contributions to the homotypic interactions of Fabs 239 and 399, both antibodies similarly use CDR H1, H2, H3, and L3 for interaction with the NANP peptide (Supplementary Fig. 4).

Fig. 2: Homotypic Fab–Fab interactions in Fab239- and Fab399-peptide complexes.
figure 2

a Two copies of Fab239 (Fab A, black: heavy chain, white: light chain. Fab B, dark brown: heavy chain, tan: light chain) in the crystal structure are shown as surfaces with the NPNA4 peptide represented as a yellow tube. b, c Interactions between two Fabs (A and B) that simultaneously recognize an NPNA4 peptide. Hydrogen bonds (yellow dashes) and salt bridges (black dashes) are highlighted. The Fabs are shown as cartoon representations with the side chains of interacting residues represented as sticks. CDRs as defined by Kabat are colored green, blue, magenta, light green, light cyan, and pink for CDR H1, H2, H3, L1, L2, and L3, respectively. d Surface representation of homotypic, head-to-head interactions of two copies of Fab399 in the crystal structure with NPNA6 peptide (yellow tube). e, f Contacts between two Fabs that simultaneously recognize the NPNA6 peptide (transparent yellow tube). All coloring schemes and representations are as for the Fab239-NPNA4 complex. g Individual residue contributions to the BSA of inter-Fab interactions are shown in a bar plot for the heavy and light chains of Fab239 and Fab399. The yellow and blue bars indicate the BSA on Fab A and Fab B (defined as in previous panels), respectively, while the green bars show the overlap of both. The CDRs are colored as in previous panels. Additionally, the alignment between the Fab heavy/light chain sequences and germline IGHV and IGKV gene sequences is shown to display somatically mutated residues. “H” and “S” mark residues that are engaged in hydrogen bonds and salt bridges, respectively. Antibody amino acid residues are labeled with a superscript “H” for heavy chain and “L” for light chain. Residue numbers are also shown as superscripts.

Antibody in vivo protection correlates with Fab apparent affinity

To approximate the binding affinity and kinetics of these Fabs to full-length CSP, bio-layer interferometry (BLI) was performed using rsCSP (Table 2, Supplementary Fig. 5). The overall Kd values with BLI were similar to those obtained from ITC, with mAbs 395 and 366 displaying the lowest affinity well within the µM range (Tables 1, 2). Most mAbs with high apparent affinity exhibited potent functional activity in the mouse model of parasite liver burden load, except for mAbs 337 and 397, which performed worse than expected. The low-affinity mAbs 366 and 395 performed less well, as anticipated (Table 2). Although only 11 antibodies were examined in this study, the percent inhibition of parasite load, normalized across the two antibody panels, exhibited a correlation with Fab apparent affinity to NPNA6 peptide by ITC and to rsCSP by BLI with R2 of 0.8435 and 0.9257, respectively (Table 2 and Fig. 3). For cross-validation of the linear regression, bootstrapping was performed to generate 1000 models with an average R2 (R2-boot) of 0.6819 and 0.7763 for apparent affinity against both NPNA6 peptide and rsCSP vs. normalized percent inhibition, respectively, which was lower than the R2 of the original models but still suggested a linear correlation (Fig. 3). In contrast, the affinity against the short NPNA2 peptide did not correlate with the inhibition of parasite liver burden load, nor did the mean Kd fold-change from ITC (Supplementary Fig. 6 and Table 1). All antibodies had equivalent association rate constants (kon) in the order of 104 M−1s−1 that did not correlate with in vivo protection from the liver burden assay (Table 2 and Fig. 3). On the other hand, the most potent blocking antibodies displayed comparable dissociation rate constants (koff) in the range of 10−3 s−1, whereas the less potent mAbs 366 and 395 had faster koff of ~10−2 and ~10−1 s−1, respectively (Table 2). Thus, the koff component of the Kd correlated well with the normalized percent inhibition from the liver burden assay (R2 = 0.9230, R2-boot = 0.7633) (Table 2 and Fig. 3). Antibody thermal stability (Tm) was also examined but showed no correlation with antibody activity as measured by liver burden load (Table 2, and Supplementary Fig. 6). Thus, the outlier mAbs 337 and 397 with high affinity but poor protection could result from poor pharmacokinetics, e.g., durability and clearance in mice, aggregation in mouse serum, or cross-reactivity with mouse antigens etc.

Table 2 Dissociation and rate constants of antibody Fabs to rsCSP obtained from bio-layer interferometry (BLI) displayed with mean dissociation constants measured to the longest peptide tested by isothermal titration calorimetry (ITC), mean % inhibition of parasite burden studies, % blood-stage parasite free mice from parasitemia studies, total paratope buried surface area (BSA), number of hydrogen bonds between paratope and epitope, and melting temperature (Tm) from differential titration calorimetry.
Fig. 3: Correlation of apparent affinity and dissociation constants with normalized parasite burden.
figure 3

The linear regression graph plots the % inhibition of the parasite liver burden load, normalized across the two mAb panels, against: a Fab apparent affinity as the dissociation constant (Kd) measured to the NPNA6 peptide using isothermal titration calorimetry; (b) Kd, (c) association rate constant (kon), and (d) dissociation rate constant (koff) measured against rsCSP using bio-layer interferometry. The dashed line indicates the fitted linear regression model with 95% confidence interval shaded in grey. The R2 value for each model is displayed, together with the average R2 from 1000 models from bootstrapping (R2-boot). Data points for each mAb are colored as shown. Source data are provided as a Source Data file.

Crystal structures reveal conserved interaction motifs

To gain a better understanding of the molecular recognition of the antibodies to the NANP repeats and what structural features might relate to low binding affinity and poor protection, we co-crystalized Fabs 239, 356, 364, 395, 224, 399, and 366 with repeat peptides (NPNA2,3,4, or 6), and compared with our previous crystal structures of Fabs 311, 317, and 39728,29. Except for Fab395, the epitopes of the IGHV3-33 Fabs share a similar conformation that includes a type I β-turn and an Asn pseudo 310 turn formed by the first and the second NPNA motifs, respectively (Fig. 4a, Supplementary Table 4). Although the Fab395 epitope is shorter (1NPNANP6), the first NPNA motif also adopts a type I β-turn (Fig. 4b). The NANP peptides in the IGHV3-33 Fabs interact mainly with the heavy chain as indicated by its dominant contribution to the buried surface area (BSA); the light-chain interactions are mostly mediated through CDR L3 (Fig. 4a, b, and Supplementary Table 5). The IGHV3-33 Fabs all have similar conformations of CDR H1 and H2, which contribute to conserved interactions with their epitopes, with Fab395 having a unique disulfide bond between HCys29 in the framework region and HCys32 in CDR H1 (Kabat nomenclature) (Fig. 4c, and Supplementary Table 6). Notably, the main chain of residues in CDR H1 form a conserved hydrogen bonding network with the NANP peptide. Conservation of van der Waals interactions are mediated by aromatic residues 32/33 in CDR H1, and 52 and 58 in CDR H2 (Fig. 4c, and Supplementary Table 6). While Tyr in position 32/33 and 58 can evolve to other aromatic residues, HTrp52 is strictly conserved in our antibody panel. Two of these residues, 52 and 58, interact specifically with Pro in the Asn pseudo 310 turn and type I β-turn respectively (Fig. 4c, and Supplementary Table 6). Although Fab395 displays a seemingly different epitope compared to other IGHV3-33 Fabs in this panel, the type I β-turn in the Fab395 epitope is positioned in the pocket where the Asn pseudo 310 turn is accommodated in other Fabs, such as Fab311 (Fig. 4b). The Pro in the type I β-turn now interacts with HTrp52 and the following Pro contacts LTyr98. The presence of the bulky LTyr94 in CDR L3 of Fab395 may prevent binding of the repeat peptide to the usual pocket for the type I β-turn as in other IGHV3-33 Fabs (Fig. 4b). Consequently, the binding pocket of Fab395 is considerably smaller and may contribute to its low binding affinity and poor protection.

Fig. 4: Crystal structure of IGHV3-33 Fabs.
figure 4

a Structures of Fab239, 311, 356, and 364 in complex with NPNA2 peptide (yellow) are shown in cartoon representation and aligned based on CDR H2. Only the CDRs involving in peptide binding are shown with CDR H1 and H2 colored in grey and CDR H3 and L3 colored for the different antibodies as indicated. A schematic of the epitope structural motifs is also indicated below. Close-up views of the paratopes are also displayed with the Fabs as cartoons embedded in their surface representation. CDR H1, H2, H3, and L3 are colored green, blue, magenta, and pink, respectively, and the peptides are shown as yellow tubes with side chains in stick representation. Antibody side chains engaging in hydrogen bonds (orange dashes) and key interacting aromatic residues are also shown as sticks. b The paratopes of Fab395 aligned to that of Fab311 based on CDR H2 (grey) are displayed as cartoons with their CDRs colored as shown, with the schematic of the Fab395 epitope structural motif also indicated below. The side chains of HTrp52, HTyr98, and LTyr94, HCys29, and HCys32 are highlighted as sticks (also with a surface representation for LTyr94). The peptides are shown as tubes with side chains as sticks and colored as indicated. c Side chains of residues involving in conserved molecular interactions from CDR H1 (green) and H2 (blue) and the peptides are shown as sticks. The peptide bound to Fab395 is colored magenta, whereas others are in yellow. Hydrogen bonds are displayed as orange dashes.

The crystal structures of Fab224 and Fab399, which are derived from the same variable heavy chain IGHV3-49 gene, also display similarities in their recognition of the NANP repeats (Fig. 5a, b, and Supplementary Table 3 and 4). The epitopes are mainly composed of three NPNA motifs (Fab399 only has NPN in the last motif) with the first two NPNA motifs adopting a type I β-turn and the last motif exhibiting a type I β-turn in Fab224 and an Asn pseudo 310 turn in Fab399 (Fig. 5a, b). The interactions of Fabs 224 and 399 with the NANP peptides are mediated mainly through the heavy chain (Fig. 5a, b and Supplementary Table 5), which resembles those observed in the IGHV3-33 antibodies (Fig. 4a, b). The binding pocket formed by the IGHV3-49 heavy chains in these two Fabs both recognize the type I β-turn through nearly identical molecular interactions, including a CH-π interaction between the germline-encoded HPhe50 and Pro in the NPNA repeat (Fig. 5a, b). Other shared interactions involve hydrogen bonds and van der Waals interactions mediated by HArg52 and HTyr53, or the somatically mutated HPhe53 in Fab399, and hydrogen bonds from the CDR H3 backbone to the NANP peptide (Fig. 5a, b). Fabs 224 and 399 both have a Trp in CDR L3 (LTrp96) that hydrogen bonds with the backbone of the Ala preceding the conserved NPNA type I β-turn (Fig. 5a, b). However, their light chains are derived from different germline genes (Supplementary Table 5). The Fab399 epitope also shares striking similarities with the epitope in a recently published, IGHV3-49-encoded Fab4493 crystal structure with the junctional peptide, GNPDPNANPN24. The PDPNANPN core of the Fab4439 epitope displays a nearly identical conformation to the ANPNANPN residues in the Fab399 epitope, and both epitopes make similar interactions with HPhe50, HArg52, and LTrp96 in the antibody paratopes24.

Fig. 5: The paratopes of Fab224, 399, and 366.
figure 5

a Crystal structure of Fab224 encoded by the IGHV3-49 gene and its epitope are shown as cartoons with residues where the main chain and side chain engage in hydrogen bonds (orange dashes) shown as sticks. The Fab cartoon representation is embedded in a transparent white surface rendering. Only the CDRs involved in the binding groove are displayed and colored green, blue, magenta, light green, and pink for CDR H1, H2, H3, L1, and L3, respectively. A schematic of the epitope is also illustrated below. b The binding groove of Fab399 encoded by the IGHV3-49 gene. c The paratope of Fab366 encoded by the IGHV1-2 gene. All representations and coloring schemes for (b) and (c) are as in (a). In addition, a surface representation of Fab366 is shown, where the black and white surfaces represent the heavy and light chains, respectively.

We also determined the structure of Fab366 derived from the IGHV1-2 gene for which a structure has not been reported previously for anti-NANP antibodies. Its paratope is composed from CDR H1, H3, L1, and L3 without contribution from CDR H2, which is so dominant in Fabs from the IGHV3-33 and IGHV3-49 families (Fig. 5c, and Supplementary Table 4). The epitope (from N- to C-terminus) spans a shallow binding groove from light to heavy chain and makes extensive hydrogen bonds with Fab residues. One notable interaction is between LTrp96 and Asn5 in the peptide, which is similar to that observed in the Fab317- and Fab397-peptide complexes28,29 (Fig. 5c). Interestingly, the Fab366-bound peptide does not contain any secondary structural motifs (i.e., type I β-turn and Asn pseudo 310 turn), but adopts a more extended conformation. In fact, the observation of a shallow Fab paratope recognizing extended NANP repeats resembles the epitope of Fab1450, which also has low affinity and poor in vivo protection26 (Fig. 5c). Despite the extensive hydrogen-bonding network and BSA contribution within the range of the other Fabs analyzed here (Fig. 5c, and Supplementary Table 5), Fab366 has a fast koff, hence low affinity, (Table 2), which seems attributable to a shallow binding groove, consistent with its similarity to Fab1450.

Another possible structural correlate of high affinity and better protection is the presence of local secondary conformations in the NANP epitopes, which are the type I β-turn and Asn pseudo 310 turn with a hydrogen bond between the side chain of Asn (residue i) and backbone nitrogen of Asn (residue i + 2) (Fig. 6). The peptides in potent and high affinity IGHV3-33 antibodies (mAbs 239, 311, 356, and 364) contain both an NPNA type I β-turn and Asn pseudo 310 turn (Table 2, and Fig. 6). Interestingly, the spiral rsCSP contains these two turns as a repeating unit, as observed in the cryo-EM structure with Fab31127. The epitopes of protective mAbs 317, 224, and 399 consist of up to three type I β-turns, but also display backbone-to-backbone H-bonds, resulting in a more compact conformation, compared to those with the IGKV3-33 antibodies, and also higher apparent affinity to rsCSP (Table 2, and Fig. 6). The mAb397 epitope exhibits both type I β- and Asn pseudo 310 turns (Fig. 6), which correspond to the structural repeat in the long-range, curved conformation when multiple copies Fab397 bind to rsCSP29. However, other factors, such as antibody pharmacokinetics, may account for the poor protection despite high binding affinity of mAb397. In contrast, the low affinity and weakly protective mAb395 has a shorter epitope that contains one type I β-turn, whereas the extended epitope of mAb366 is stabilized by only one H-bond between the side chain of Asn3 (i) and backbone oxygen of Asn5 (i + 2), leading to a turn that is wider and more open than a type I β-turn (Fig. 6). Consequently, the extended epitopes of mAb366 when repeated in CSP may not be capable of adopting either the long-range curved or spiral conformations that are observed in mAbs 397 and 311, respectively.

Fig. 6: Summary of the epitope conformations for all antibodies analyzed in this study.
figure 6

The peptide epitopes are represented with yellow sticks, and the hydrogen bonds are shown in orange dashes. The type I β-turns and Asn pseudo 310 turns are highlighted with green and blue circles. All peptides were aligned based on the first type I β-turn that is present in the structure, except for the epitope of Fab366 which was aligned to the overall epitope of Fab239. The epitope of Fab317 and Fab311 were obtained from the crystal and cryo-EM structures from previous studies27,28 (PDB ID: 6AXL and 6MB3, respectively).


This study strengthens and extends previous observations of recurring features of antibody recognition to the NANP repeats that pertain to their functional activity. Notably, we explored antibodies that were encoded by different sets of IGHV genes, the mechanisms by which different antibodies can achieve high Fab binding affinity, and the consequences for functional protection in liver burden and parasitemia models of infection. We also identified two additional antibodies, mAb239 (IGHV3-33) and mAb399 (IGHV3-49), that display homotypic inter-Fab contacts, while simultaneously recognizing NANP repeat epitopes, as observed previously in mAbs 311 and 1210 (both from IGHV3-33)26,27. Fabs 239 and 311 exhibit an asymmetric, side-to-side Fab–Fab interaction with propensity to form a supramolecular, extended spiral structure with rsCSP, whereas Fab399 shows symmetric inter-Fab contacts with a head-to-head configuration that may not be consistent with forming the same type of spiral architecture, and may be more similar to that observed in Fab121026. We note that some of the key residues involved in homotypic Fab–Fab interactions are already present in the germline genes for both Fabs 239 and 399 (Fig. 2g). Indeed, most residues involved in inter-Fab contacts in Fab399 are germline-encoded, such as HThr28, HAsp31, HTyr32, and HArg94 (Fig. 2f, g). Therefore, these two antibodies may exhibit a propensity for inter-Fab contacts, even prior to somatic hypermutation (SHM). A similar observation is found in rituximab (RTX) against CD20, a completely unrelated therapeutic antibody34. Two copies of the Fab display homotypic interactions upon binding to their epitope on the CD20 dimer, resulting in a cross-linked circular supra-assembly of three RTX IgGs and three CD20 dimers34. Intriguingly, all residues involved in the RTX Fab–Fab interactions are also germline-encoded34.

The question remains whether the selection and subsequent maturation of these inter-Fab contacts impact antibody maturation or functional activity. The prevalence of an immunogenic repeat region in CSPs across different Plasmodium species10, albeit with distinct repeating sequence motifs, suggests that this region may act as an immune decoy35 by favoring the selection of antibodies with Fab–Fab interactions. A study of mAbs produced by immunization with Pf sporozoites indicates that clonal selection of higher-affinity, germline B-cell receptors (BCRs), rather than efficient SHM, seems to drive anti-NANP responses36. These germline BCRs may then represent the precursors of anti-NANP mAbs that display inter-Fab contacts. Consequently, it was hypothesized that high-avidity cross-linking of BCRs from homotypic contacts may signal B-cells to exit the germinal center either prematurely or with limited rounds of somatic mutation, especially in the antibody-antigen interface, and perhaps disfavor the formation of long-lived plasma cells, which are responsible for generating high amount of circulating antibodies as immediate responses to sporozoite invasion37. The increasing number of observations of homotypic Fab–Fab interactions in anti-NANP antibodies here and in previous studies26,27, and recently in an antibody targeting CSP on a different Plasmodium species, P. berghei38, support the above hypothesis. Fab–Fab interactions boost apparent antibody affinity through avidity effects and are also shared features among protective anti-NANP mAbs. However, such contacts may not be beneficial for the formation of immune memory and antibody potency and thus durable antibody responses, and further suggests that short NANP repeat based immunogens to prevent or modulate homotypic contacts might have some advantages to consider in the design of next-generation CSP-based malaria vaccines. Such immunogens date back to the 1980s when short NANP repeats [(NANP)3] conjugated to tetanus toxoid were tested in a human trial and induced strong anti-NANP antibody responses39, but no booster effect due to the lack of T-cell epitopes e.g. from other non-repeat domains40. Interestingly, recent immunization studies with a truncated CSP with only 9 NANP repeats induced lower BCR signaling in NANP-repeat-specific B-cells, stronger responses to N- and C-term epitopes, and protected more mice against mosquito bite challenge as compared to immunization with a longer CSP containing 27 NANP repeats41. Whether this approach will lead to more durable antibody responses and robust immune memory remains to be determined.

The crystal structures reported here of additional antibodies derived from the IGHV3-33 gene further emphasize the role of the conserved Trp52 in CDR H2 for interaction with Pro in the NANP repeats (Fig. 4). Alanine substitution not only of HTrp52, but also LTrp32, and HTrp33 in mAbs 311, 317, and 397, disrupts antibody binding considerably29. The convergent usage of Trp is also highlighted by the interaction between LTrp96 encoded by the light chain J gene in mAb366 with Asn in the NANP repeats (Fig. 5c), whereas LTrp96 in Fab224 and 399 hydrogen bonds with the alanine backbone in the NPNA peptide (Fig. 5a, b). We also observed that HPhe50 was present in two antibodies from the IGHV3-49 germline for interaction with Pro in the conserved NPNA type I β-turn, similar to the Trp in IGHV3-33 mAbs (Fig. 5a, b). Other aromatic residues, such as His and Tyr, can form van der Waals interactions with the peptide as seen in CDR H2 of IGHV3-33 antibodies (Fig. 4c). The high prevalence of such interactions between Fab aromatic residues and the NANP peptide is summarized for anti-NANP and anti-junction mAbs in Fig. 7. A recent cryo-EM structure of mouse mAb3D11 against the repeat region of P. berghei CSP reveals that 3D11 uses eight aromatic residues to form an aromatic cage for antigen recognition, with a germline-encoded Tyr from the light-chain playing a key role38. These structural insights suggest that the NANP repeats in PfCSP prime the human immune system to select antibodies from germline genes with well-positioned aromatic residues for the initial encounter. These favorable, dominant interactions with germline-encoded aromatic residues may limit SHM and represent another hurdle that the NANP repeats pose for eliciting durable and more potent human antibody responses.

Fig. 7: Summary of interactions between Fab aromatic residues and the NANP or junction region.
figure 7

The Fab aromatic residues and CSP peptides are shown in black and yellow sticks, respectively. Hydrogen bonds are represented as orange dashes. Residue identity and number (with H and L for heavy and light chain) and the corresponding antibody germline gene is indicated. Crystal structures of Fab1210, MGG4, CIS43, 1450, and 317 and 397 were obtained from previous studies22,23,26,28,29 (PDB ID: 6D01, 6BQB, 6B5O, 6D11, 6AXL, and 6UC5, respectively).

A recent study has shown that antibody affinity to NANP peptides correlates with inhibition of the parasite’s traversal of hepatocytes in vitro and also that antibodies with high affinity to NANP, rather than the other motifs (e.g., NPDP, or NVDP) in the junctional region, exhibit a high level of protection in the mouse model of parasitemia24. Here, we also demonstrate a correlation between apparent affinity, as measured against both NANP repeats and rsCSP, and in vivo protection in the liver burden assay (Fig. 3, and Tables 1 and 2). The parasitemia data also follow this trend, but are not a large enough dataset to assign statistical significance. Other structural and biophysical features, which include total paratope BSA, number of hydrogen bonds between paratope and epitope, and antibody melting temperature, do not appear to correlate with in vivo protection (Table 2 and Supplementary Fig. 6). Perhaps not surprisingly, the anti-NANP antibodies share a similar kon, but can differ substantially in their koff, which explains the lower affinities observed for mAbs 366 and 395 (Table 2). The koff then dictates the linear correlation with normalized percent inhibition of parasite burden (Fig. 3). However, the caveat for the current analysis is that only two antibodies, mAbs 366 and 395, in this study have low affinity and poor in vivo protection, and, hence, these two data points tend to dominate the regression models. As mAbs analyzed in this study were initially screened based on their avidity on ELISA, low-affinity antibodies are then likely to be underrepresented. To reduce model bias, we also performed bootstrapping to generate 1000 models for each analysis and observed lower average R2 values (R2-boot), but that still indicate correlation with in vivo protection (Fig. 3). Despite these limitations, our results should serve as an important platform for the development and engineering of anti-NANP mAbs, including antibody evolution using yeast display technologies.

Likewise, structural analysis reveals features on both the paratopes and epitopes that may contribute to low-affinity antibody and, consequently, poor protection. One correlate that we observe is that the high-affinity protective mAbs all recognize epitopes with secondary structural motifs, consisting of a type I β-turn and Asn pseudo 310 turn, which represent the repeating unit of the long-range spiral form of rsCSP (Fig. 6). Conversely, low affinity, less protective antibodies possess epitopes with few to no structural motifs (Table 2, and Fig. 6). A restricted binding groove, and consequently short epitope with a single type I β-turn, likely contributes to the low affinity of Fab395 (Figs. 4b and 6). On the other hand, the low affinity and less protective Fab366 recognizes an extended conformation of NANP repeats that lack any secondary structural motifs with its shallow groove (Figs. 5c, 6). Intriguingly, the non-protective and low-affinity Fab145026 also shares similar features of utilizing a shallow groove to bind an extended NANP epitope. Consequently, structural motifs such as the repeating type I β-turn and Asn pseudo 310 turn as in the spiral rsCSP could be incorporated into the design of next-generation immunogens, also with shorter length designs to prevent homotypic interactions. Future studies to explore anti-NANP mAbs from different germline genes and/or immunization trials will help verify this hypothesis and/or contribute additional structural properties that influence binding affinity and in vivo protection. Other factors, such as pharmacokinetics, may impact antibody protective capacity in vivo and require further examination. Overall, the findings here should aid in defining the optimal characteristics of anti-NANP antibodies for therapeutic use, and also guide the design of more effective vaccines against malaria.


Antibody isolation, cloning, and sequencing

Antibody isolation, cloning, and sequencing were performed in a previous study using the protocol outlined in the original paper42. Briefly, peripheral blood mononuclear cells (PBMCs) were stained with the following mAbs: anti-CD3-FITC (BioLegend, cat# 300406, clone UCHT1), anti-CD14-FITC (BioLegend, cat# 325604, clone HCD14), anti-CD19 -BV421 (BioLegend, cat# 302234, clone HIB19), anti-CD20-PerCP/cy5.5 (BD, cat# 340955, clone L27), anti-CD27-BV510 (BioLegend, cat# 302836, clone O323), anti-CD38-PE/cy7 (BioLegend, cat# 356607, clone HB-7), anti-IgA-FITC (Miltenyi, cat# 130-113-475, clone IS11-8E10), anti-IgM-APC/cy7 (BioLegend, cat# 314520, clone MHM-88). IgG+ plasmablasts were single-cell sorted into 96-well PCR plates containing hypotonic buffer (330 nM dNTPs (NEB), 1 μg/ml BSA (NEB), 2 mM DTT (Sigma), 0.5% IGEPAL-630 and 500 unit/mL of Ribolock (Thermo)) based on gating for CD19+CD20CD27+CD38++IgAIgM cells. Generation of barcoded cDNA, PCR amplification, and 454 sequencing of IgG were performed as described42 with the following modifications: Biotinylated Oligo(dT) and RT maxima H- (Fisher Scientific Company) were used for reverse transcription, cDNA was extracted using Streptavidin C1 beads (Life Technologies), DNA concentrations were determined using qPCR (KAPA SYBR® FAST qPCR Kit for Titanium, Kapabiosystems), and amplicons were sequenced using Roche 454 Titanium sequencing. Barcode assignment, sequence assembly, assignment of V(D)J, and identification of mutations were performed as described42 except for the following modifications: a minimum coverage of 10 reads was required for each heavy and light chain assembly to be acceptable. Heavy (light) reads were required to assemble unique contigs within a well. In cases where there was more than one contig, we rejected the well from consideration unless one of the contigs included at least 90% of the reads. V(D)J assignment and mutation identification was performed using an implementation of SoDA43. The sequences of mAb239, mAb337, mAb356, mAb364, mAb395, mAb224, mAb399, and mAb366 were derived from plasmablasts of protected vaccinees, of which the IgG antibody sequences belonged to expanded plasmablast IgG sequence families within the respective vaccinee’s repertoire. The collection of plasmablasts was part of the phase 2a clinical trial of RTS,S/AS01 vaccine with fractional third and fourth dose31, of which the protocol was approved by the Walter Reed Army Institute of Research (WRAIR) Institutional Review Board and the Western Institutional Review Board, and written informed consent was obtained from each subject before study procedures were initiated (Clinical identifier: NCT01857869). None of this work was performed as part of this study, but the antibody sequences used here were derived from that study.

Antibody production

For protection studies, all mAbs were made as IgG1 and expressed in Chinese hamster ovary cells (ExpiCHO; Thermo Fisher Scientific). The mAbs were purified using HiTrap Protein A HP column (GE Healthcare), followed by size exclusion chromatography (Superdex 200 16/90; GE Healthcare) and washed with 0.5 M Arginine in Dulbecco’s PBS pH 7.4 (DPBS: Thermo Fisher), which has been shown to be effective in removing endotoxins from IgGs44. The absence of endotoxin contamination was determined using Endosafe® nexgen-PTS™ portable endotoxin testing system (Charles River). For structural and biophysical characterizations, all Fabs were expressed in ExpiCHO cells and purified using a HiTrap Protein G HP column (GE Healthcare) followed by size exclusion chromatography as used for the IgG1 but in Tris Buffered Saline (TBS: 50 mM Tris pH 8.0, 137 mM NaCl, 3.6 mM KCl). rsCSP was expressed in E. coli (SHUFFLE cells; New England Biolabs) and purified as previously reported45. In short, a pET28a plasmid containing the rsCSP construct was used to transform E. coli SHUFFLE competent cells. A single colony was used to start a 50-mL overnight culture. Two 1-L cultures were inoculated the next day with 25 mL overnight culture and were grown at 37 °C. When the optical density at 600 nm reached a value of 1, the cultures were induced with 1 mM isopropyl β-D-1-thiogalactopyranoside for 6 h. The cells then were harvested and lysed by microfluidization. The lysate was incubated overnight with Ni cOmpleteTM resin (Roche) and was eluted using a buffer containing 200 mM imidazole. Typically, 10 mg of protein was obtained per 2 L of culture. All synthetic NANP peptides in this study were purchased from Innopep Inc. (San Diego, CA) with the purity level of 95% for structural studies and 98% for biophysical measurements.

Assessment of in vivo protection

Studies using mice were carried out using 6-8 weeks old C57BL/6 females, maintained at the animal facility of the Johns Hopkins Bloomberg School of Public Health. Mice rooms are kept at 40-60% relative humidity at a temperature of 68-79 °F with at least 10 room air changes per hour. Mice have a cycle of 14 h light and 10 h darkness. The assays using mice were performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The protocol was approved by the Animal Care and Use Committee of the Johns Hopkins University, protocol number MO18H419. Experiments were performed with the protocol established previously32,33. Briefly, to measure liver burden, mice (N = 5) were IV injected with 100 µg of mAb per mouse and, 16 h later, challenged IV with 2000 P. berghei transgenic sporozoites expressing the P. falciparum CSP and luciferase. 42 h after challenge, mice were injected IP with 100 µl of D-luciferin (30 mg/mL), having previously been anesthetized by exposure to isoflurane. Bioluminescence in the liver was measured using an IVIS Spectrum (Perkin Elmer). For the blood-stage parasitemia study, mice (N = 6) were passively immunized with 100 or 300 µg of mAb and, 16 h later, recipient mice and controls were anesthetized with 2% Avertin prior to challenge by a 10-minute exposure to the bites of 7 mosquitoes of which 5 on average are infected with the transgenic parasite. Parasite infection of red blood cells was assessed from day 4 after a challenge by microscopic observation of blood smears.

Isothermal titration calorimetry

ITC experiments were performed on a MicroCal Auto-iTC200 (GE Healthcare). Prior to the measurement, all Fabs were extensively dialyzed against DPBS. The peptides were placed in the syringe at a concentration of ~150 μM for Ac-NPNA NPNA-NH2 (NPNA2; ~300 μM for Fab224 and ~500 μM for Fab399), ~60 μM (IGHV3-33 Fabs) or ~140 μM (other Fabs) for Ac-NPNANPNA NPNANPNA-NH2 (NPNA4), ~40 µM (IGHV3-33 Fabs) or ~60 μM (other Fabs) for Ac-NPNANPNA NPNANPNA NPNANPNA-NH2 (NPNA6), whereas the concentration of Fab in the cell was ~10 μM for all experiments. The Fab and peptide concentrations were determined by UV absorbance at 280 nm and 205 nm, where the molar extinction coefficients for the peptides at 205 nm were estimated using the Protein Parameter Calculator []46. The titrations were all performed with peptides in the syringe and antibodies in the cell and consisted of 16 injections of 2.45 μl peptide for experiments with NPNA2 and 32 injections for other experiments at a rate of 0.5 μl/s at 120 s time intervals, with injection duration of 4.9 s, injection interval of 180 s, and reference power of 5 μCal. Experiments were conducted in triplicate (N = 3) at 25°C. Fitting of the integrated titration peaks was performed with Origin 7.0 software using a single-site binding model. The first data point was excluded from the fit as common practice.

Bio-layer interferometry

The binding of all Fabs to biotinylated-rsCSP was measured using bio-layer interferometry (Octet Red; Pall ForteBio). Biotinylated-rsCSP were loaded onto streptavidin biosensors (Pall ForteBio, cat No 18-5019) at 10 μg/mL in kinetics buffer (TBS + 0.002 % Tween20 and 0.01 % BSA). The loaded sensors were dipped into solutions containing dilutions of each Fab in Kinetics buffer at a concentration of 1000, 500, 250, 125, and 62.5 nM, respectively (except for Fab317, the serial dilution concentrations are 250, 125, 62.5, 31.25, 15.63 nM, respectively). The binding experiments were performed with the following steps: 1) baseline in kinetics buffer for 60 s; 2) loading of rsCSP for 60 s; 3) baseline for 60 s; 4) association of antibody for 60 s; and 5) dissociation of antibody into kinetics buffer for 120 s. A reference well with no rsCSP loaded onto the sensor was run in all experiments and subtracted from sample wells to correct for drift and buffer evaporation. Octet assays were carried out at 25 ˚C. Data were analyzed using the Octet Red Data Analysis software version 9.0 and fitted with a 1:1 binding model.

Differential scanning calorimetry

The thermal stability of all IgG1 in Dulbecco’s PBS (Thermo Fisher) from 20 to 110 °C was measured using a MicroCal VP-Capillary calorimeter (Malvern) at a scanning rate of 90 °C/h. Data were analyzed using the VP-Capillary DSC automated data analysis software and fitted to a non-two-state model.

X-ray crystallography and structural analysis

Fabs 239, 356, 364, 395, 224, 250, 399, and 366 were concentrated to 10 mg/ml and mixed with either NPNA2, NPNA3, NPNA4, or NPNA6 peptide in a 1:5 molar ratio of Fab to peptide. Six substitutions and one deletion (from 112SSASTKG118 to 112VSRRLP117) were introduced into the elbow region of Fab395 and Fab366 heavy chains, and different mutations (from 112SSASTKG118 to 112FNQIKG117) were introduced to the elbow region of Fab364 heavy chain to stabilize the Fab and facilitate crystallization as shown in previous work47. Additionally, Fab239-NPNA2, Fab364-NPNA2, and Fab250-NPNA3 co-complexes were mixed with Streptococcal immunoglobulin G-binding protein G (domain III) in the Fab to protein G ratio of 1:1. Domain III of protein G has also previously been shown to enhance the crystallizability of Fabs48. Crystal screening of Fab-peptide complexes was performed using our high-throughput, robotic CrystalMation system (Rigaku) at The Scripps Research Institute, using the sitting drop vapor diffusion method with a 35 μL reservoir solution and each drop consisting of 0.1 μL protein + 0.1 μL precipitant. Fab239-NPNA2 co-crystals were grown in 0.2 M NaCl, and 20% (w/v) PEG 3350 at 20˚C as precipitant and were cryoprotected in 20% ethylene glycol. Fab356-NPNA2 crystals and Fab366-NPNA3 grew in 40% PEG-600, and 0.1 M CHES pH 9.5 with final pH of 9.6 at 4˚C. Fab364-NPNA2 crystals grew in 20% PEG-8000, and 0.05 M KH2PO4 at 4˚C and were cryoprotected in 20% glycerol. Fab395-NPNA2 crystals grew in 30% PEG-4000, 0.2 M ammonium acetate, and 0.1 M sodium citrate pH 5.6 at 20˚C. Fab224-NPNA4 crystals and Fab239-NPNA4 grew in 20% PEG-6000, and 0.1 M HEPES pH 7.0 at 20˚C and were cryoprotected in 20% ethylene glycol. Fab399-NPNA3 crystals grew in 20% PEG 3350, and 0.2 M potassium fluoride pH 7.2 at 20˚C and were cryoprotected in 20% ethylene glycol. Fab250-NPNA4 crystals grew in 1.6 M ammonium sulfate, and 0.1 M citric acid pH 4.0 at 20˚C and were cryoprotected in 20% glycerol. Fab399-NPNA6 crystals grew in 50% MPD, 0.2 M ammonium dihydrogen phosphate, 0.1 M Tris pH 8.5 at 20˚C. X-ray diffraction data were collected at the Advanced Proton Source beamline 23ID-B or beamline 23IDD, or at the Stanford Synchrotron Radiation Lightsource beamline 12-2, and processed and scaled using the HKL-2000 package49. The structures were determined by molecular replacement using Phaser50. Structure refinement was performed using phenix.refine51 and iterations of refinement using Coot52. Amino-acid residues of the Fabs were numbered using the Kabat system, and the structures were validated using MolProbity53. For structural analysis, buried surface areas (BSAs) were calculated with the program MS54, and hydrogen bonds were assessed with the program HBPLUS55.

Statistical analysis

The parasite liver burden load data (N = 5 mice) were compared for significance using a two-sided Mann-Whitney U test, whereas the blood-stage parasitemia data (N = 6 mice) were analyzed using the log-rank test, where p < 0.05 (*) and p < 0.01 (**) indicated levels of statistically significant differences. The liver burden data were reported as the geometric mean of the total flux ± SD (Fig. 1a). All statistical parameters for the mouse in vivo studies were calculated with the Hmisc (liver burden data), and the survival and survminer packages (parasitemia data), and the graphs were plotted with the ggplot2 package in R. Bootstrapping for the linear regression models was performed with the caret package and also plotted with the ggplot2 in R. Each ITC experiment was performed with three replicates (N = 3), and the data are reported as the arithmetic mean ± SD.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.