Sequence variation in Plasmodium falciparum Histidine Rich Proteins 2 and 3 in Indian isolates: Implications for Malaria Rapid Diagnostic Test Performance

Commercial malaria rapid diagnostic tests (RDTs) detect P. falciparum histidine rich protein 2 (PfHRP2) and cross-react with PfHRP3, a structural homologue. Here, we analysed natural variations in PfHRP2 and PfHRP3 sequences from Indian isolates and correlated these variations with RDT reactivity. A total of 1392 P. falciparum positive samples collected from eight endemic states were PCR amplified for the Pfhrp2 and Pfhrp3 genes and sequenced. The deduced protein sequences were analysed for repeat variations and correlated with RDT reactivity. Out of 1392 PCR amplified samples, a single sample was Pfhrp2 negative and two samples were Pfhrp3 negative. Complete Pfhrp2 and Pfhrp3 sequences were obtained for 769 samples and 750 samples, respectively. A total of 16 distinct repeat motifs were observed for Pfhrp2 and 11 for Pfhrp3, including some new repeat types. No correlation was found between variations in the size of Pfhrp2 repeat types 2 and 7, nor between any combinations of repeat motifs, and performance of a commercial RDT at low parasite densities. The findings suggest that sequence diversity in Pfhrp2 and Pfhrp3 genes in Indian isolates is not likely to negatively influence performance of currently used PfHRP2 RDTs.

which shares some of the same amino acid repeats present in PfHRP2 (types 1, 2, 4, and 7 among others), also contributes to reactivity of PfHRP2 based RDTs. It is assumed that antibodies utilized in commercial RDTs most likely recognize some of these repeats, but their actual specificity is not well described. As far as the functional relevance of these amino acid repeats in PfHRP2 for the performance characteristics of RDTs is concerned, no clear correlation between the number of repeats, size and other variations has been found 5 . However, by analysing a limited number of parasite isolate sequences with their corresponding RDT reactivity patterns at various densities, Baker et al. reported that the combined number of type 2 (AHHAHHAAD) × type 7 (AHHAAD) repeats (cut off <43) could predict negative reactivity when parasite density is <250 parasites/µl 6 . Although this prediction was not confirmed in a subsequent study that included a large sample size (PfHRP2 in 458 isolates from 38 countries), the possibility remains that some of the variation in this protein may impact the performance of RDTs, especially when the parasite density is <200 parasites/µl 5 .
Complicating the performance of PfHRP2 based RDTs is the natural deletion of Pfhrp2 genes in parasite populations in some geographical regions. A substantial proportion of parasite isolates with both Pfhrp2 and Pfhrp3 gene deletions have been reported, especially in the Amazonian region of South American countries and recently in some African countries [7][8][9] . As these parasites will escape detection by PfHRP2 based RDTs and may be selected to expand due to routine use of RDTs, WHO recommends surveillance to detect such parasites in areas where PfHRP2 RDTs are commonly used. Recently, we reported that among 1521 microscopically confirmed P. falciparum parasites screened from eight malaria endemic states, 50 isolates were not detected by RDTs; 36 of them had deleted Pfhrp2 and 27 of them had deleted Pfhrp3 10 . Here, we describe the natural variations in the Pfhrp2 and Pfhrp3 genes from all samples that were successfully sequenced using isolates obtained from the same study in India and discuss their relevance for RDT reactivity in India.

Results
Sequence variation in Pfhrp2. Out of 1392 samples, 1391 samples were amplified for Pfhrp2 and one sample was negative. The quality of DNA was further analysed for this negative sample by amplifying the msp1, msp2, and glurp marker genes as recommended for determination of DNA quality 11,12 . All three marker genes amplified in this sample and a repeat amplification for the Pfhrp2 gene failed, thereby confirming a Pfhrp2 deletion. However, this sample was positive for the Pfhrp3 gene.
The size of the repeat region of exon 2 varied from 438 to 897 bases (mean 730). The difference in the mean size of exon 2 bases for each site is reported in Table 1. The size variation was largely attributed to the variation in numbers of 27, 18 and 9 bp repeats. Based on the data from this study, the common arrangement of the repeats was drawn for HRP2 and HRP3 (Fig. 1). Among the 16 different types of repeats observed in this study, 13 were reported previously in other global populations of parasites 6 . The type 25 (AHHASY), type 26 (AHHAHHVSD) and type 27 (AHHSHHAAD) repeat motifs were unique sequences found in this study (Table 2). The length of the deduced exon-2 Pfhrp2 sequence varied from 146 to 299 aa (mean 243 aa). When compared to the country mean, the mean amino acid length was found to be significantly lower in both Gujarat and Tripura States (p < 0.05, Fig. 2).
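As a quick consistency check on the figures above, the deduced amino-acid lengths follow from the nucleotide lengths at three bases per codon. The sketch below is only illustrative; the function name `bp_to_aa` is an assumption and not part of the study's analysis pipeline.

```python
def bp_to_aa(n_bp: int) -> float:
    """Convert a coding-region length in bases to amino acids (3 bases per codon)."""
    return n_bp / 3


# Exon 2 repeat-region sizes reported in the text for Pfhrp2 (bases).
for label, n_bp in [("min", 438), ("max", 897), ("mean", 730)]:
    print(f"{label}: {n_bp} bp -> {bp_to_aa(n_bp):.1f} aa")

# The 27, 18 and 9 bp repeat units correspond to 9-, 6- and 3-residue motifs.
for unit_bp in (27, 18, 9):
    print(f"{unit_bp} bp repeat -> {bp_to_aa(unit_bp):.0f} aa")
```

The minimum and maximum reproduce the reported 146 and 299 aa exactly, and the mean of 730 bases rounds to the reported 243 aa.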
Variation in the number of repeats. The total number of repeats and the number of each repeat within Pfhrp2 varied between isolates, both within and between sites (Tables S1 and S2). Repeat types 2, 6, 7 and 12 were observed in 100% of the isolates sequenced (Table 2). Repeat types 1, 3, and 10 were observed in 90% of the samples while types 5 and 8 were found in over 80% of samples. The prevalence of repeat types 13, 14, 25, 26 and 27 was found to differ between sites, from 3 to 10%. Further analysis was performed to examine the differences in the number of type 2 and 7 repeats in parasite isolates from different sites (Fig. S1). The type 2 repeats make up the largest share, comprising about 47% of the total repeats, followed by type 7 repeats accounting for 12% (Table S3).
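To make the counting step concrete, the following is a minimal sketch of how repeat types could be tallied in a deduced amino-acid sequence and how per-type prevalence across isolates could be computed. It is not the authors' pipeline. It uses only motifs quoted in this paper (type 2 is taken in its nine-residue form, AHHAHHAAD, matching the 27 bp repeat unit), the greedy longest-match rule is an assumption, and the sequences are invented for illustration.

```python
from collections import Counter

# Motifs quoted in the text only; a full analysis would need the complete
# repeat-type table (Table 2), which is not reproduced here.
REPEAT_MOTIFS = {
    "AHHAHHVAD": 1,
    "AHHAHHAAD": 2,   # nine-residue form, assumed from the 27 bp repeat size
    "AHH": 4,
    "AHHAAD": 7,
    "AHHASY": 25,
    "AHHAHHVSD": 26,
    "AHHSHHAAD": 27,
}


def count_repeats(seq):
    """Greedy left-to-right scan, trying the longest motif first at each position."""
    motifs = sorted(REPEAT_MOTIFS, key=len, reverse=True)
    counts = Counter()
    i = 0
    while i < len(seq):
        for motif in motifs:
            if seq.startswith(motif, i):
                counts[REPEAT_MOTIFS[motif]] += 1
                i += len(motif)
                break
        else:
            counts["unassigned"] += 1  # residue not covered by a listed motif
            i += 1
    return counts


def prevalence(isolates, repeat_type):
    """Fraction of isolates carrying at least one copy of the given repeat type."""
    hits = sum(1 for seq in isolates if count_repeats(seq)[repeat_type] > 0)
    return hits / len(isolates)


# Toy sequences invented for illustration (not study data).
toy = [
    "AHHAHHVAD" + "AHHAHHAAD" * 2 + "AHHAAD" + "AHH",
    "AHHAHHVAD" + "AHHAHHAAD" + "AHHASY" + "AHH",
]
print(count_repeats(toy[0]))
print(prevalence(toy, 7))
```

Because short motifs are prefixes of longer ones (for example, AHH followed by AHHAAD spells AHHAHHAAD), any tokenization rule involves a choice; longest-match-first is used here purely for simplicity.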

Sequence variation in Pfhrp3. Out of 1392 samples, 1390 samples were amplified for the Pfhrp3 gene. The two negative samples had deleted the Pfhrp3 gene. The PCR products of 1390 samples were sequenced and 54% (750/1390) yielded good quality sequences. The size of the Pfhrp3 repeat region of exon 2 amplified by PCR ranged from 336 to 729 bp (mean 504 bp). Out of the total 750 sequences, 137 unique Pfhrp3 sequences were found (KX679832-968). The length of the deduced Pfhrp3 sequence encoded by the repeat region of exon-2 varied from 112 to 243 amino acids (mean 168 aa). The mean amino acid length was significantly higher in Gujarat State and significantly lower in Chhattisgarh State when compared to the country mean (p < 0.05, Table 2). Overall, the level of diversity in the Pfhrp3 gene, as measured by the proportion of sequence variations, is significantly lower than in the Pfhrp2 gene (p < 0.05).
Eleven different repeats were present in PfHRP3 (Table 2). Nine repeats were identical to previously reported sequences 5 and two new repeat types were identified in this study. All the sequences started with a type 1 repeat (AHHAHHVAD) and ended with a type 4 repeat (AHH). The remaining repeat types accounted for the central part of the protein. The site and state-wide variations in repeats are summarized in Tables S4 and S5. All isolates had at least a single stretch of a 28 amino acid long non-repeat sequence. In some isolates, a second copy of the same non-repeat sequence was found (Fig. 1c). Although this non-repeat sequence is highly conserved, a limited number of non-synonymous polymorphisms were observed (Table S6). The repeat types 4, 7, 16-18 and 20 were found in 90% to 100% of isolates while the presence of other repeats varied. The number of repeats across different sites is highly variable, including type 16 and type 17 repeats in parasite isolates from different sites (Fig. S1). The type 16 and 17 repeats combined make up the largest share, comprising about 50% of the total repeats (Table S7).
Scientific Reports | 7: 1308 | DOI:10.1038/s41598-017-01506-9
Correlation between repeat lengths and RDT positivity. The impact of natural variations in PfHRP2 repeats in field isolates on the performance of RDTs, especially in detecting low density infections in field settings, is not well understood. The relationship between number and combination of major repeat types 2 and 7 was investigated in low density infections (<1,000 parasites/µl) (Fig. 3). No direct relationship between these two repeat types and the ability to detect low density infections was observed. Indeed, there were 15 very low density infections (<200 parasites/µl) and there was no difference in the ability of an RDT to detect such infections due to variations in the size of repeats or in combinations of type 2 and 7 repeats (Fig. 3). Similarly, no relationships between type 6 and 10 repeats and RDT positivity were observed (data not shown).

Discussion
Malaria RDTs have become widely valued diagnostic tools for resource limited settings, including India, where a large burden of malaria occurs in economically disadvantaged tribal populations [13][14][15] . The successful implementation of evidence based treatment of malaria infections, as recommended by WHO, became feasible due to the availability of several commercial RDTs. PfHRP2 based RDTs are the most commonly used RDTs in the field and demonstrate a high detection rate; however, their accuracy is variable between similar products and even between different lots 16 . The performance of RDTs in the field can be influenced by a number of factors, but the impact of the high degree of variation in the PfHRP2 antigen sequences 17 , especially in the detection of low density infections at the threshold of detection limit (<200 parasites/µl), has been a subject of investigation 5,6 . Our investigation is one of the largest studies reported to date that determined the natural variations in the Pfhrp2 gene, and its paralogue Pfhrp3, in field isolates and examined the impact of such variation in detecting low density infections by a commonly used RDT.
This study clearly demonstrates the extensive variation in both Pfhrp2 and Pfhrp3 genes in Indian parasite isolates, consistent with previous studies that have analysed variations in field isolates from various geographical regions, including India 5, 6, 18-20 . Genetic variation in Pfhrp2 and Pfhrp3 was found in all sites with different epidemiological characteristics (Table 1). Significantly higher genetic variations were observed from the states of Jharkhand and Maharashtra and significantly lower from the states of Madhya Pradesh and Rajasthan (Table S2). Although higher variation was also found in Odisha and Gujarat States, the differences were not statistically significant (p > 0.05).
As various repeats constitute the majority of amino acids in both PfHRP2 and PfHRP3, our analysis focused on the deduced amino acid variations, frequency of repeats and their organization. The overall size of repeat region amino acids varied considerably within and between sites. The type 2 repeats contributed nearly half (47%) of all repeat amino acid sequences while type 7 repeats comprised 12%; this pattern is comparable to previous studies 5,6,18 . In the present study, the number of type 2 repeats varied from 12 to 15 and only the isolates from Madhya Pradesh showed a significantly lower prevalence of type 2 repeats as compared to the country mean (p < 0.05). In contrast, the type 7 repeat was significantly higher in Madhya Pradesh compared to the country mean (Table S2). The type 1, 3, 6, 8, 10 and 12 repeats were the other most common repeats in PfHRP2 while type 4, 5, 13, 14 and 19 and three new repeat types (25, 26 and 27) varied between isolates. Overall, these results are consistent with previous reports 6, 19 . Types 1, 4, 7, 15, 16, 17, 18 and 20 repeats were the common repeats in PfHRP3, as reported previously from global isolates 6 . In this study, two non-repetitive regions were found in all the states; however, they were more common among Rajasthan and Tripura isolates. Baker et al. 5 reported that only limited isolates from three countries showed two non-repetitive regions. The location of the non-repetitive region varied between our study and previous reports 6 (Fig. 1c).
We have previously reported Pfhrp2 and Pfhrp3 genetic deletions in the 50 samples that were RDT negative but microscopy positive 10 . It is important to note that we found one additional Pfhrp2 deleted and two Pfhrp3 deleted samples among the 1392 samples subjected to PCR amplification. This observation suggests that our approach, as recommended by WHO to test samples that failed RDTs after microscopic confirmation of parasitemia, did not underestimate the prevalence of Pfhrp2 and Pfhrp3 deletion in this study. However, it is important to point out that the only sample with Pfhrp2 deletion was positive for the Pfhrp3 gene and this must have contributed to positive RDT detection of this sample, as previously observed 21 . Similarly, the two Pfhrp3 negative samples were positive for the Pfhrp2 gene, accounting for their positive RDT detection.
Table 2. Types of amino acid repeats in PfHRP2 and PfHRP3.
Baker et al. demonstrated that a combined type 2 × type 7 repeat value of <43 in PfHRP2 predicted negative reactivity of RDTs at parasite densities <250 parasites/µl. This finding was made by testing the reactivity of 16 cultured isolates at different dilutions of in vitro grown parasites with a panel of RDTs and analysing the reactivity pattern using a binary regression model 6 . Although their own subsequent study involving a large sample size failed to confirm the validity of their model 5 , some studies, including a previous Indian study, have reported some correlation between type 2 × 7 repeat lengths and RDT reactivity at low density 18,19 . In the present study, we compared the RDT reactivity pattern of the samples with their own deduced amino acid variation, which allows a direct comparison between RDT reactivity and variation pattern. The SD Bioline RDT used in this study was found to show a lower sensitivity at densities <1,000 parasites/µl in previous field evaluations in India 14 . We compared the RDT reactivity pattern of all samples with a parasite density of <1,000 parasites/µl with the repeat length of type 2 × 7 (Fig. 3). Interestingly, we did not find any correlation between RDT reactivity and either the length variation of these repeats or the combined length of type 2 and 7. Importantly, there were several isolates with low density parasitemia (<200 parasites/µl) and a combined type 2 × 7 repeat length of <43 that were nevertheless clearly detected by the RDT 5 . Overall, these results are consistent with Baker et al.'s subsequent conclusion that variations in the length of any of the repeats or of the type 2 × 7 combination did not show any correlation with RDT reactivity at lower parasite density (<200 parasites/µl). Based on this observation, we suggest that it will be worth comparing RDT reactivity with repeat variations in the field, rather than in in vitro cultured parasite dilutions, to further confirm current observations.
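The rule under discussion can be written down in a few lines, which makes the comparison with observed RDT results explicit. This is a sketch under stated assumptions: the "×" is read as the product of the type 2 and type 7 repeat counts, the thresholds (<43 and <250 parasites/µl) are taken from the text, and the function names and example counts are hypothetical rather than study data.

```python
def type2x7_score(n_type2: int, n_type7: int) -> int:
    """Product of type 2 and type 7 repeat counts (reading 'x' as multiplication)."""
    return n_type2 * n_type7


def predicted_negative(n_type2, n_type7, density_per_ul,
                       score_cutoff=43, density_cutoff=250):
    """Rule as summarized in the text: score < 43 and density < 250 parasites/µl."""
    low_score = type2x7_score(n_type2, n_type7) < score_cutoff
    low_density = density_per_ul < density_cutoff
    return low_score and low_density


# Hypothetical isolates: (type 2 count, type 7 count, parasites/µl).
examples = [(12, 3, 150), (12, 4, 150), (12, 3, 800)]
for n2, n7, density in examples:
    print(n2, n7, density, predicted_negative(n2, n7, density))
```

In this study, several isolates falling below both thresholds were still detected by the RDT, so the rule is shown only to clarify what was being tested, not as a validated predictor.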
A limitation of the study is that we could not analyze all available samples: samples that did not yield good quality sequences were excluded, and re-sequencing of these samples was not attempted.

Conclusion
This study provides the first country-wide data on genetic diversity of P. falciparum hrp2 and hrp3 genes. The findings confirm that both Pfhrp2 and Pfhrp3 showed extreme variation in the type and number of the repeats. There was no correlation between length variation of repeat types 2 and 7 and RDT positivity, even at low density parasitemia. Among the RDT positive samples, only a single sample was Pfhrp2 deleted and two samples were Pfhrp3 deleted. Our findings may lead to a better understanding of the Pfhrp2 structure and how its variation contributes to RDT positivity and, in turn, may also help in the generation of improved malaria RDTs.

Material and Methods
Study details and sample source. In a previous study, we determined the natural genetic variation of Pfhrp2 and Pfhrp3 genes in Plasmodium falciparum samples obtained from a total of sixteen sites from eight malaria endemic states in India with a low level prevalence of Pfhrp2 and Pfhrp3 deleted parasites 10 . These samples were used in this investigation to characterize natural variations in Pfhrp2 and Pfhrp3 genes. Details of individual study sites and parasitemia data are reported in Table 1. Out of 16 sites, eight exhibited a high level of malaria endemicity (annual parasite incidence >5) and eight had low malaria endemicity (annual parasite incidence <2). Out of 16 sites, P. falciparum positive blood samples were collected from 15 sites. The epidemiological characteristics and details of the study sites have been previously described 3,10 .
Ethical approval. The study protocol for patient participation and the collection of blood samples for laboratory testing from participants were approved by the institutional ethics committee of the National Institute for Research in Tribal Health (NIRTH), Jabalpur. Before collecting the samples, written informed consent was obtained from all study participants or from the parents/guardians of children, as per the guidelines of the Indian Council of Medical Research. A copy of the consent form in the local language was also provided and explained to the patients or parents/guardians of children. The participation by a CDC investigator was approved under a non-research determination by the Center for Global Health, CDC, Atlanta.