Main

By the end of 2002, the genome sequences of many of the key species in the life cycle of the malaria parasite will be available1,2,3. This includes: (1) the 25–30 million base pairs (megabase or Mb) of the genomes of Plasmodium falciparum — the parasite responsible for more than 95% of all malaria deaths — and P. yoelii, a rodent parasite; (2) the 280 Mb of the genome of Anopheles gambiae, the most important mosquito vector of P. falciparum in sub-Saharan Africa, where malaria causes the most deaths; and (3) the 2,900 and 2,600 Mb of the human and mouse genomes, respectively. Here, we summarize the current status of the Plasmodium, human and Anopheles genome sequencing projects. We show how a number of fields of biology, based on genomics, enabled and integrated by bioinformatics, and often referred to collectively as 'functional genomics', are being brought to bear on malaria research, and how we think these approaches will improve the chances of success in our battle against malaria.

Status of Plasmodium spp. sequencing projects

To sequence the genome of P. falciparum4, a consortium of genome centres and funders was established in 1996, which included The Institute for Genomic Research and Naval Medical Research Center (TIGR/NMRC), Sanger Institute and Stanford University (Table 1). The 25-Mb P. falciparum (clone 3D7) genome, comprising 14 chromosomes and an estimated 5,000 genes, has been sequenced using a whole-chromosome shotgun strategy. Additional sequence information includes the 6-kilobase (kb) mitochondrial genome5 and a 35-kb circular DNA that localizes to a novel organelle called the apicoplast6. The sequences of chromosomes 2 and 3 were published in 1998 and 1999 respectively7,8, and a draft sequence and annotation of the entire genome will be published in 2002.

Table 1 Current status of Plasmodium sequencing projects

Experimental and computational hurdles had to be overcome to complete sequencing and assembly of this genome7,8,9. Large genomic fragments of P. falciparum DNA are not stable in Escherichia coli, a problem thought in large part to be due to the skewing of nucleotide content from the 59% adenine and thymine (A+T) of the human genome and 49% of the E. coli genome to the 80% found in the P. falciparum genome. The presence of A+T repeats/homopolymers has also made gap closure and accurate assembly difficult. New techniques9 and annotation software10 had to be developed to overcome these problems. Additional resources used to facilitate physical gap closure included use of sequence-tagged site markers derived from end sequences of yeast artificial chromosomes (YACs) previously mapped to the genome, microsatellite markers and a high-resolution linkage map11. Additionally, groups of contigs were ordered on the chromosomes by reference to optical restriction maps12. The genome-wide, high-resolution linkage map comprises 901 markers that fall into 14 inferred linkage groups with a high rate of uniform meiotic crossover activity11.
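The two sequence properties blamed for these difficulties, overall A+T content and long A/T runs, are simple to measure. The following is a minimal sketch rather than any tool used by the sequencing consortium; the function names and the toy sequence are illustrative assumptions.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


def longest_at_run(seq: str) -> int:
    """Length of the longest uninterrupted stretch of A/T bases."""
    best = run = 0
    for base in seq.upper():
        run = run + 1 if base in "AT" else 0
        best = max(best, run)
    return best


# Made-up sequence for illustration only; not real P. falciparum data.
toy = "ATATTTAAATGCATATATTTTTAAACGTTATA"
print(f"A+T fraction: {at_fraction(toy):.2f}")
print(f"longest A/T run: {longest_at_run(toy)} bp")
```

Applied in sliding windows along a contig, the same two measures would flag the regions most likely to resist cloning and assembly.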

Progress in other Plasmodium spp. sequencing initiatives is summarized in Table 1. Based on analysis of the P. falciparum genome sequencing effort, it was estimated that a three- to fivefold shotgun sequence coverage of the entire genomes of other Plasmodium spp. would give 90–95% of the information provided by the significantly more expensive and time-consuming 10–15-fold coverage on a chromosome-by-chromosome basis of the P. falciparum genome. A fivefold shotgun sequence of the genome of the rodent parasite P. yoelii has been completed, and will be published in 2002. It is anticipated that a threefold shotgun sequence of the genomes of the rodent parasites P. chabaudi and P. berghei will be completed soon, as will a fivefold coverage of the human parasite P. vivax, and of the simian parasite P. knowlesi. Additional ongoing efforts are directed at sequencing expressed sequence tags (ESTs; Table 1) of P. falciparum, P. yoelii13, P. berghei14 and P. vivax14. As genomic sequences from other related apicomplexans, such as Cryptosporidium and Toxoplasma, become available, comparative genomic analyses will enhance understanding of these protozoan pathogens.
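The trade-off between coverage depth and completeness can be illustrated with the idealized Lander–Waterman model, in which reads land uniformly at random and the expected fraction of bases covered at c-fold coverage is 1 − e^(−c). This is a sketch under stated assumptions, not the basis of the 90–95% estimate above, which was derived empirically from the P. falciparum project and refers to information content rather than bases covered. The read length of 500 bp is an assumed value, and the model ignores the cloning bias of A+T-rich DNA.

```python
import math


def expected_covered_fraction(coverage: float) -> float:
    """Expected fraction of bases covered by at least one read (idealized model)."""
    return 1.0 - math.exp(-coverage)


def expected_gaps(genome_bp: float, read_bp: float, coverage: float) -> float:
    """Expected number of sequence gaps, approximated as N * exp(-c) for N reads."""
    n_reads = coverage * genome_bp / read_bp
    return n_reads * math.exp(-coverage)


GENOME_BP = 25_000_000  # approximate Plasmodium genome size quoted in the text
READ_BP = 500           # assumed average read length, for illustration only

for c in (3, 5, 10):
    covered = expected_covered_fraction(c)
    gaps = expected_gaps(GENOME_BP, READ_BP, c)
    print(f"{c:>2}x coverage: {covered:.1%} of bases covered, ~{gaps:,.0f} gaps")
```

Under these assumptions threefold coverage already reaches about 95% of bases and fivefold more than 99%, which is consistent with the decision to sequence the other Plasmodium genomes at lower depth.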

To facilitate interpretation and dissemination of genomic sequence data to researchers worldwide, informatics capabilities that integrate sequence data, automated analyses and annotation data emerging from the P. falciparum genome project have been developed (ref. 15; and Table 1).

Insights from P. falciparum genome data

The biological insights gleaned from publicly available genomic sequence data have already led to the identification of important leads for antimalarial drugs and potential targets for vaccines (refs 16–18; and see Table 2). Furthermore, computational analysis of published genomic sequence data has improved our understanding of the evolutionary and functional adaptations of P. falciparum.

Table 2 Highlights of data generated from the Plasmodium genome project

Target and lead identification for drugs based on genome sequence data

A rapid demonstration of how genomic sequence facilitates target and lead identification occurred almost as soon as the preliminary, uncompleted sequence of chromosome 2 was posted on the TIGR website. Using homology to known genes in plants and algae, researchers identified several lipid biosynthesis genes in P. falciparum and demonstrated their targeting to the apicoplast19. These genes are attractive drug targets given their critical role in parasite viability and the absence of the type II fatty acid biosynthesis pathway in humans; at least one major pharmaceutical company has assessed the effect on malaria parasites of compounds known to be active in this pathway. In follow-up, using sequence data on the Sanger website, researchers identified a key enzyme in this pathway — enoyl–acyl-carrier protein (enoyl–ACP) reductase (FabI) — and demonstrated in vitro activity against P. falciparum using an inhibitor to this enzyme17.

Another effort was based on unpublished data from chromosome 14 posted on the TIGR website. Researchers used genomic sequence data to identify the enzymes 1-deoxy-d-xylulose-5-phosphate (DOXP) synthase and DOXP reductoisomerase, and provided evidence that isoprenoid biosynthesis in P. falciparum depends on the DOXP pathway and is critical for parasite viability18. They identified two drugs known to be active in this pathway, fosmidomycin and its derivative FR-900098, and demonstrated activity in vitro against P. falciparum, including against multidrug-resistant P. falciparum, and in vivo against P. vinckei in mice18.

The value of genomic sequence data and comparative analyses to identify new drug targets was shown recently with the identification in P. falciparum of the metal-dependent RNA triphosphatase protein family, members of which are crucial in mRNA cap formation and eukaryotic gene expression20. The structure of the active site and catalytic mechanism of this protein family in P. falciparum and fungi are completely different from the RNA triphosphatase domain of the metazoan (human) capping enzymes, and metazoans encode no identifiable homologues of the fungal or Plasmodium RNA triphosphatases. The structural similarity between the plasmodial and the fungal RNA triphosphatases raises the exciting possibility of achieving antifungal and antimalarial activity with a single class of mechanism-based inhibitors20. Protein families that are expanded in P. falciparum and are considered attractive drug targets include the cysteine and aspartyl class of proteases, with both classes constituting a parasite adaptation to its need for haemoglobin digestion7,21,22. Of special interest is the representation of several members of the recently described metacaspase family of cysteine proteases in Plasmodium23. The absence of this class of proteases in humans makes the P. falciparum metacaspases particularly attractive as drug targets.

Target identification for vaccines based on genome sequence data

P. falciparum proteins that traffic to the erythrocyte surface are important in the pathogenesis of malaria, and are potentially important targets for vaccine development24. Genomic sequence data have demonstrated the extent of paralogous expansion of genes encoding such parasite proteins. Several members of these multi-gene families are being considered as vaccine candidates given their prominent role in malarial pathogenesis (refs 24,25; and Table 2, Fig. 1). Prominent among these are the variant gene families clustered in the subtelomeric regions of the P. falciparum genome (var genes encoding P. falciparum erythrocyte membrane protein (PfEMP), and rifin, stevor and clag genes)7,26,27,28 and P. vivax genome (vir genes)29. Furthermore, the availability of genomic sequence data has provided invaluable clues into the regulation of the large variant-antigen PfEMP gene family7,8,30. Other attractive targets for vaccines are parasite proteins important for parasite invasion of erythrocytes (see reviews in this issue by Richie and Saul, pages 694–701, and Miller et al., pages 673–679). Here again, genomic sequence data have enabled the identification and characterization of paralogues of the Duffy-binding-like domain-containing, EBA-175-like gene family members31,32, and the rhoptry33 and reticulocyte binding protein34 families involved in red cell invasion.

Figure 1: Protein domain architecture in representative Plasmodium proteins.

Domain name abbreviations: DBL, Duffy-binding-like; CIDR, cysteine-rich interdomain region; CLAG, cytoadherence-linked asexual gene; RIF, Rifins; VWF, von-Willebrand factor A; TSP, thrombospondin; EGF, epidermal growth factor-like; Perforin, membrane-activated complex-perforin; STK, serine/threonine kinase; EF, calcium-binding EF-hand; SAM, sterile alpha-motif; K, kelch repeats; PP2A, PP2A phosphatase; DnaJ, DnaJ family of chaperones; Pept_C1, papain cysteine protease; Asp_Pr, aspartyl protease.

Although most of these merozoite and erythrocyte surface proteins have plasmodial-specific domains, it is of interest to note that there are several examples of plasmodial proteins with extracellular binding domains that are found predominantly in animal genomes (Table 2, Fig. 1). These include the epidermal growth factor domain-containing merozoite surface proteins and Pfs25/28 (P. falciparum gametocyte surface protein) that have homologues in other Plasmodium species35,36. Likewise, thrombospondin and von-Willebrand factor A domain-containing proteins such as circumsporozoite protein (CSP), sporozoite surface protein 2/thrombospondin-related adhesive protein (SSP2/TRAP) and CSP-TRAP-related protein (CTRP) are involved in the gametocyte and sporozoite stages of the life cycles37,38. Many of these proteins are central to initiating infection of the host, elicit a protective immune response, and are being developed as vaccine candidates (see reviews by Richie and Saul and by Miller et al. in this issue). The complete genomic sequence allows for rapid identification of all members of these families and their investigation as potential targets for vaccine development. Other interesting examples of animal extracellular adhesion domains detected in predicted Plasmodium proteins include the LCCL (domain identified in Limulus factor C, Coch-5b2 and LGL1) module39 and perforin domain-containing proteins40 in the P. falciparum genome.

Finally, over 40% of predicted proteins in the genome are predominantly low-complexity or non-globular proteins with several instances of low-complexity inserts in the midst of globular protein domains7. This may reflect an adaptive mechanism of the parasite to evade an antibody-driven immunity and may impact vaccine design to improve specificity of antibody response.
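One common way to flag low-complexity regions in a predicted protein is to compute Shannon entropy over a sliding window of residues. The sketch below illustrates that general idea only; it is not the method used to obtain the 40% figure, and the window size, entropy threshold and example sequences are arbitrary assumptions.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of a window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())


def low_complexity_fraction(protein: str, window: int = 12,
                            threshold: float = 2.2) -> float:
    """Fraction of residues lying in at least one low-entropy window.

    window and threshold are illustrative defaults, not published values.
    """
    if len(protein) < window:
        return 0.0
    flagged = [False] * len(protein)
    for i in range(len(protein) - window + 1):
        if shannon_entropy(protein[i:i + window]) <= threshold:
            for j in range(i, i + window):
                flagged[j] = True
    return sum(flagged) / len(protein)


# Hypothetical sequences: an asparagine-rich insert inside a varied region.
varied = "ACDEFGHIKLMNPQRSTVWY"
with_insert = varied + "N" * 20 + varied
print(f"varied only:  {low_complexity_fraction(varied):.2f}")
print(f"with insert:  {low_complexity_fraction(with_insert):.2f}")
```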

mRNA and protein expression to identify drug and vaccine targets

Genomic sequence data and new technologies have provided the foundation for studies to determine mRNA and protein expression at different stages of the parasite life cycle41. Over 3,000 random inserts from a P. falciparum mung bean nuclease genomic library were used in a DNA microarray to identify differences in gene expression between the asexual blood-stage trophozoite and the sexual blood-stage gametocyte form of the parasite42. These experiments increased the list of stage-specific transcripts in P. falciparum by an order of magnitude, provided potential targets for developing drugs and vaccines, and yielded clues as to how to explore differences in the metabolic machinery of the parasite as it transits from humans to mosquitoes43. A DNA microarray based on genes from chromosomes 2, 3, 12 and 14 was used to assess differential parasite gene responses to antimalarial drugs in drug-sensitive and drug-resistant strains at different times during the asexual erythrocytic stage of the life cycle (ref. 41; and A. Whitney, unpublished data). A different approach, based on sequencing of a P. yoelii sporozoite complementary DNA library, helped identify genes expressed in the sporozoite stage of the life cycle13. In addition to identifying over 1,300 new sporozoite-expressed genes, this study documented the sporozoite-stage expression of potentially important surface molecules.

Work has also begun to identify proteins expressed at specific stages of the parasite life cycle41; data that will be especially important for vaccine development. There are multiple methods available for conducting such studies, but the most progress to date has used recent technological advances in liquid chromatography–mass spectrometry (LC-MS), wherein mass spectral fingerprints that are experimentally derived may be matched to computationally generated profiles derived from the genomic sequence (refs 41,44; and D. Carucci, personal communication). The availability of an accurately assembled and annotated complete genome is critical for the success of these ongoing efforts.
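The matching step described above can be sketched in a few lines: predicted proteins are digested in silico, the mass of each peptide is computed, and observed masses are compared against that list. This is a deliberately simplified illustration that assumes trypsin as the protease, uses monoisotopic masses of unmodified residues, and matches on peptide mass alone; real LC-MS pipelines also score fragment-ion spectra. The protein names and sequences are hypothetical.

```python
from typing import Dict, List, Tuple

WATER = 18.01056  # monoisotopic mass of H2O added to each free peptide

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def tryptic_digest(protein: str) -> List[str]:
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed: List[float], proteins: Dict[str, str],
                 tol_da: float = 0.01) -> List[Tuple[float, str, str]]:
    """Return (observed mass, protein name, peptide) for every match within tol_da."""
    hits = []
    for name, seq in proteins.items():
        for pep in tryptic_digest(seq):
            mass = peptide_mass(pep)
            hits.extend((obs, name, pep) for obs in observed
                        if abs(obs - mass) <= tol_da)
    return hits


# Hypothetical predicted proteins standing in for genome-derived gene models.
predicted = {"hypothetical_1": "MKTAYIAKQRPLNEK", "hypothetical_2": "GGSWFRDEVK"}
observed = [peptide_mass("TAYIAK"), peptide_mass("DEVK"), 999.99]
for obs, name, pep in match_masses(observed, predicted):
    print(f"{obs:.4f} Da -> {name}: {pep}")
```

Because every candidate peptide comes from a predicted gene model, errors in assembly or annotation translate directly into missed or false matches, which is why an accurate genome matters for this approach.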

Status of the human genome project

The 2,900-Mb human genome, with approximately 30,000 predicted protein-coding genes1,2 and a genome-wide catalogue of several million single nucleotide polymorphisms (SNPs), is now available for researchers to better define the genetic determinants of human disease1,45. Furthermore, ongoing efforts at haplotype analysis will help discern disease-related SNPs from the background of irrelevant genetic variation46. Likewise, efforts are underway to obtain portraits of gene and protein expression profiles from human tissue from a wide range of anatomic compartments in the body during different states of health and disease and over time. These projects are in large part dependent on the availability of the complete genomic sequence. The relevance of the human genome sequence to human disease and medicine has been discussed in a recent publication47.

Relevant insights from the human genome project

Human genetic variations are one of the principal determinants of susceptibility to many common infectious diseases48. Malaria was one of the first infectious diseases to be studied extensively and many susceptibility and resistance loci have been identified over the past few decades49. The most prominent human polymorphism that protects against severe malaria is the sickle cell trait, although other haemoglobinopathies or blood group antigen variants have been shown to influence clinical outcomes by altering the efficiency of invasion of erythrocytes by the parasites or the development of the parasites within erythrocytes.

Other determinants of clinical severity include receptor polymorphisms in genes encoding proteins such as intercellular adhesion molecule 1 (CD54) and CD36, which are involved in the sequestration of infected red cells in vascular endothelium, or polymorphisms in the human tumour-necrosis factor-α gene promoter that are associated with cerebral malaria50,51,52. Although a dozen or more specific susceptibility determinants have been defined so far, including both structural and regulatory polymorphisms of red cell proteins and genes of the immune system, it is most likely that many more are yet to be discovered49. It must also be emphasized that efforts made so far do not represent a methodical and comprehensive survey of the genome. As the human genome sequence and the extensive catalogue of SNPs have been publicly available only since early 2001, we have yet to reap the full benefits of this tremendous resource for malaria research.

Status of the Anopheles genome project

The A. gambiae genome project is funded largely by the National Institute of Allergy and Infectious Diseases and the French Government, with the World Health Organization/Tropical Disease Research programme providing a coordinating role and assisting in database development. Celera Genomics and Génoscope have recently completed a shotgun sequencing effort (10 × coverage), which has been assembled by Celera. Two independent annotations, one by Celera and the other by the European Bioinformatics Institute using Ensembl, should be completed by early 2002. The annotated genome may be presented publicly as early as March 2002 either through Ensembl or through an adapted version of the Drosophila GadFly. End sequences of two different A. gambiae bacterial artificial chromosome (BAC) libraries (>10 × coverage), the physical map location of 2,000 BAC clones, perhaps as many as 80,000 ESTs, and several sequenced and annotated genomic DNA contigs (100–500+ kb) will facilitate assembly and annotation. Celera Genomics, the European Molecular Biology Laboratory, Génoscope, the Institute of Molecular Biology and Biotechnology in Crete, the Institut Pasteur in Paris, TIGR, and the Universities of Iowa, Rome and Notre Dame are contributing.

All genomic libraries used in this project have been made from a strain of A. gambiae (PEST) that has a sex-linked, pink-eye mutation53 and a standard chromosome arrangement54. The shotgun sequence has been obtained from plasmid libraries with 2-kb, 10-kb and 50-kb inserts produced at Celera. All BAC libraries, sequenced cDNAs and the end-sequenced 10-kb plasmid clones will be archived at the American Type Culture Collection as part of the National Institutes of Health-supported Malaria Research and Reference Reagents Resource Center (http://www.malaria.mr4.org/mr4pages/index.html).

Relevant insights from the Anopheles genome project

The 120-Mb genome of fruitfly, Drosophila melanogaster, provided the first complete invertebrate insect genome sequenced using the shotgun sequence and assembly approach55. The completion of the 280-Mb Anopheles genome will undoubtedly enhance our understanding of the evolutionary adaptations of mosquitoes since their divergence from the fly approximately 200 million years ago, including traits such as the physiology of blood feeding and blood meal digestion and behaviours associated with the selection of humans as the blood meal source and oviposition in aquatic sites produced by human agricultural activity. Most of the specific EST efforts in this project are focused on mosquito tissues with which malaria parasites interact56, including the midgut, haemocytes, fat body and salivary glands, or on genes expressed in tissues such as the head and antennae that may mediate behaviours such as host selection and mating. A number of current research efforts are motivated by an interest in the mosquito's response to infection with the malaria parasite, as well as the basis whereby several selected strains of A. gambiae are able to mount a defensive response to Plasmodium57,58.

Comparative analysis of the Drosophila, Anopheles and human genomes has already provided several practical insights. The shared aspects of innate immunity between the two insects and humans have been confirmed by computational1,59 and experimental evidence56,60. Of special interest are the specific adaptations of haematophagous insects (including Anopheles) that promote transmission by enhancing blood feeding. These include the convergent evolution of proteins that interact extensively with mammalian proteins which regulate haemostasis and the complement cascade61. On a larger comparative scale, preliminary examinations of conserved gene arrangement between Drosophila and A. gambiae suggest that substantial gene order arrangement may be preserved at a scale of up to several hundred kilobases, and very large blocks of most chromosome arms may be shared62. The ability to compare mosquito and Drosophila genes in terms of sequence similarity and location will be important in guiding researchers towards an understanding of mosquito gene function.

What does genomics offer for the control of malaria?

There is currently no vaccine for malaria, no optimal way to sustainably reduce or eliminate contact between humans and infective mosquitoes, a poor system for early diagnosis and recognition of those at greatest risk for developing severe disease, and a disturbing, ever increasing resistance to the drugs used to treat malaria. We believe that genomics and related disciplines provide the scientific foundation for improving our chances to address and solve each of these problems63.

Vaccine development

Effective vaccines elicit immune responses that destroy the infectious agent or the host cells in which it resides, or inhibit a function critical for its survival. Immune responses against P. falciparum can protect through both mechanisms. Immunization with radiation-attenuated sporozoites protects more than 90% of recipients against experimental challenge for at least 10 months64. Protection is thought to be mediated primarily by T cells against peptides from parasite proteins expressed in infected hepatocytes (function independent), although antibodies that reduce sporozoite invasion of hepatocytes (function inhibiting) also have a role (refs 65–67; and Fig. 2). In areas of Africa with the most intense malaria transmission, children who survive to the age of 7–10 rarely develop life-threatening P. falciparum infections. They become infected frequently, but their immune systems limit the infections, thereby preventing severe disease. Antibodies against parasite proteins expressed on erythrocytes, which prevent sequestration in the microcirculation (function inhibiting)25, and on the surface of merozoites, which prevent invasion of erythrocytes (function inhibiting)68 or initiate antibody-dependent cellular inhibition69 activity (function independent), are thought to be primary in this 'naturally acquired' protective immunity.

Figure 2: Malaria 'vaccinomics'.

The potential impact of genomics and functional genomics on identification of new functionally or anatomically important targets for malaria vaccine development, and optimizing selection of human populations for immunization with specific vaccines.

During the past 20 years there has been considerable work to develop subunit vaccines that duplicate the protection provided by exposure to irradiated sporozoites or by naturally acquired immunity (see review in this issue by Richie and Saul, pages 694–701). Such a vaccine has yet to be developed. There are numerous potential explanations for the lack of current success. One is that exposure to the whole parasite elicits a more potent, protective immune response than do the subunit vaccines tested thus far. However, in the case of people protected by the irradiated sporozoite vaccine, or by naturally acquired immunity, T-cell and antibody responses are generally modest, and in many instances lower than after subunit vaccination. It is more likely that immunization with only a few parasite proteins cannot duplicate the immunity elicited by exposure to a parasite that has thousands of proteins; modest immune responses against tens, hundreds or thousands of parasite proteins may be additive or synergistic. In the case of naturally acquired immunity, this 'breadth' would be expanded by exposure to many polymorphic strains of P. falciparum. If this is the case, then the malaria genome and SNP projects may provide the essential foundation for duplicating this whole-organism immunity.

Analysis of the sequence of the P. falciparum genome has dramatically expanded our knowledge regarding the paralogous expansion in the genome of parasite proteins expressed on the erythrocyte surface (Table 2, Fig. 1). But genomics on its own will not be enough70. Immune responses are directed against proteins, not genes, and the parasite expresses different proteins at different stages of its life cycle70.

We believe that the first step in duplicating whole parasite-induced protective immunity is to catalogue expression of proteins at each stage of the parasite life cycle. One way to accomplish this could involve generating antibodies to every parasite protein and determining their subcellular localization70. This cannot be done rapidly or without great expense, so other approaches have been initiated. Assessment of mRNA and/or protein expression at different stages of the parasite life cycle would provide comprehensive data. Unfortunately, neither is 'the best' way. Immune responses recognize proteins, not mRNA, so gene expression profiles must be confirmed at the protein level. Furthermore, there are many instances (for example, alternative splice sites or post-translational modifications) in which mRNA expression data will not be adequate. The inability to culture infective P. falciparum sporozoites and the poor performance of hepatocyte cultures make it difficult (sporozoites) or impossible (liver-stage parasites) to acquire enough parasite material for conventional, non-amplified gene chip or DNA microarray analysis of these critical stages of the life cycle. For these reasons there has been increasing emphasis on LC-MS methods for establishing differential protein expression profiles at each stage of the life cycle41.

We anticipate the generation of comprehensive databases that catalogue stage-specific expression of all genes and proteins in the P. falciparum genome, and the polymorphisms in the DNA sequences of these genes in field isolates. Concurrent efforts to establish the functional importance of each gene, and the significance of their protein–protein interactions will take longer. Nonetheless, characterization of genes, their transcripts, and the proteins they encode will provide the foundation for the development of effective vaccines.

In subsequent sections we address how genomics-based data may be helpful in focusing vaccination strategies. This will include the need to identify patients at high risk, as well as a 'systems biology/immunology' approach to monitor immune response, wherein molecular kinetic 'portraits' of individuals should provide a more comprehensive indication of the response to infection and immunization than do currently available cross-sectional single immune response measurements (Fig. 2, Table 3). Significant difficulties lie ahead in developing vaccines that duplicate the protection provided by immunization with irradiated sporozoites or naturally acquired immunity (or that work through other mechanisms), delivering those vaccines optimally to the individuals who need them most, and effectively monitoring responses to those vaccines. In addition to increased understanding of the target of immune responses (the parasite proteins and epitopes within those proteins), work in vaccinology that takes advantage of the human genome project to develop improved methods for maximizing magnitude, quality and longevity of protective immune responses will be fundamental to fielding effective malaria vaccines.

Table 3 Integrative analysis of malaria using genome-based computational and experimental approaches

Reducing contact between humans and infective mosquitoes

The sequence of the A. gambiae genome will accelerate development of new malaria control tools to supplement existing insecticides and bednet strategies, not only by advancing current research efforts, but also, and perhaps most important, by opening lines of research not currently envisaged. The dire need for new vector-related tools and strategies for malaria control is obvious given that most of the world's successful malaria control programmes have utilized primarily transmission-reduction tools such as insecticides and bednets71.

Preserving the efficacy of insecticides in the face of resistance. The most important vector-targeted control strategies today involve insecticide-impregnated bednets. Unfortunately, resistance to the pyrethroid insecticides used in bednets has now been recorded in A. gambiae populations in both east and west Africa as well as in A. funestus (another major vector in Africa) from southern Africa71. One of the main causes of resistance is upregulation of groups of insecticide-detoxifying enzymes such as oxidases, esterases and glutathione S-transferases72. The A. gambiae genome will enable rapid identification of the genes involved and the molecular changes that lead to resistance, information that may guide the selection and/or development of alternative insecticides. It will also provide tools that can be used in population studies to monitor the emergence and spread of important traits like insecticide resistance.

Increasing the efficiency of vector-targeted control efforts. A. gambiae is a member of a complex of what are now recognized as seven related but genetically and ecologically distinct species71. Decades of field work in Africa, where the vector was identified as A. gambiae sensu lato, have produced a picture of A. gambiae ecology that is probably the composite of data from at least two or even more members of this complex. Recent studies in west Africa suggest A. gambiae sensu stricto may comprise at least two and possibly as many as five distinct, reproductively isolated cytogenetic forms73. An immediate outcome of the A. gambiae genome project should be development of simple, DNA-based diagnostics for these different forms that can be used to more fully understand cytotype-specific differences in vector behaviour, ecology and pathogen transmission.

Development of a genetic control strategy for A. gambiae. The A. gambiae genome project builds on a decade of progress in developing tools for the genetic control of malaria transmission74. Robust germline-transformation tools now exist for Anopheles mosquitoes75. Significant progress has been made in the study of parasite development in the Anopheles vector58,71. For example, it has now been shown that normal parasite development can be almost totally blocked in Anopheles mosquitoes transformed with a construct that binds to both midgut and salivary glands (M. Jacobs-Lorena, personal communication). Considerable progress has also been made in studies of A. gambiae population structure, using tools that emerged from laboratory cloning and mapping efforts71. The A. gambiae genome project will help solve the problems that must be overcome before even a limited genetic-control field trial can be implemented.

Early diagnosis and predicting risk of developing severe malaria

To achieve the goal of reducing morbidity and mortality caused by malaria, it will be imperative to target expensive resources to those at highest risk of suffering severe disease and death. We believe that genomics-based efforts may facilitate this approach.

Early diagnosis of infection. Conventional approaches for diagnosing P. falciparum infection include microscopic examination of blood smears, the use of dipsticks that assay P. falciparum histidine-rich protein 2, and the less practical amplification of P. falciparum genes using the polymerase chain reaction (PCR)76. Gene expression and proteomics profiling studies will probably provide additional markers. We speculate that several of these protein- or metabolite-based markers could be adapted in antibody-based assays to enhance detection as well as provide surrogate markers of drug resistance and clinical outcomes (Table 3).

Rapid identification and monitoring of drug resistance

It is imperative that drug-resistant strains of P. falciparum be detected when they first emerge. Genetic and genomic sequence data have been the basis for PCR-based screens against individual P. falciparum genes, which are used to document and monitor resistance to chloroquine and sulpha-based treatment regimens77,78. The genome-wide, high-resolution linkage map11 is complementary to sequence data and will lead to the identification of most drug-resistance genes and their polymorphisms. The capacity to detect P. falciparum metabolite and protein levels during the erythrocytic stages of its life cycle using mass spectrometry should lead to the development of clinically relevant, rapid methods for determining drug resistance and for the early recognition of failing therapy.

Identification of individuals at risk of developing severe disease

Case–control and association studies that have been used successfully to dissect the genetics of human disease will benefit significantly from the genomic sequence and the availability of genetic markers48. Additionally, the recent completion of the mouse genome and the comparative genomic map of human and mouse will complement ongoing efforts to dissect determinants of host susceptibility and immune response in murine models of malaria3,79. Future efforts aimed at linking genetic correlates in populations at risk with biological correlates of parasite invasion, parasite development, cytoadherence efficiency and clinical outcomes will be critical in identifying at-risk populations to better direct vaccination and chemoprophylaxis strategies (Table 3, Fig. 2). Similar efforts are needed to develop pharmacogenetics in malaria to improve predictors of drug efficacy and toxicity80. We foresee a time when such data will be used to direct the allocation of resources as well as to establish appropriate diagnostic, therapeutic and prevention guidelines in malaria control programmes.
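As a minimal illustration of the case–control analyses referred to above, the following Python sketch estimates the odds ratio for carrying a host genetic variant among severe-malaria cases relative to controls, with an approximate 95% confidence interval (Woolf's logit method). The function name and all counts are hypothetical and are not drawn from any study cited here.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate confidence interval for a 2x2 table.

    Uses Woolf's logit method; all four cells must be non-zero.
    z=1.96 gives an approximate 95% interval.
    """
    a, b, c, d = (case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(odds ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 cases and 40 of 100 controls carry the variant.
    est, lo, hi = odds_ratio(20, 80, 40, 60)
    print(f"odds ratio = {est:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

An interval lying wholly below 1, as with these made-up counts, would be consistent with the variant being associated with protection from severe disease. A real analysis would also need to account for confounders such as ethnicity, age and exposure.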

Drug development and new therapeutics

Advances in proteomics will be crucial in identifying differentially expressed proteins, which, along with comparative genomic analysis, the use of protein interaction maps and an understanding of metabolic pathways, will help to identify and prioritize targets for therapeutic intervention (Table 3, Fig. 2). Targeting regulatory regions as a therapeutic strategy may now be feasible, given developments in computational and experimental approaches to identify potential regulatory regions in P. falciparum genes30, and highly selective methodologies that are able to target these highly AT-rich segments81.

Recent advances in gene-targeting technologies to manipulate P. falciparum genes, as well as those of rodent Plasmodium species, will significantly aid target validation82,83. Of immediate interest will be Plasmodium proteins that are amenable to interrogation using currently available small-molecule design platforms and target-family chemical libraries, and that are sufficiently divergent or distinct from their human homologues. Structure-based modelling, the identification of critical catalytic and substrate-recognition motifs, polymorphism detection to target invariable regions, and the ability to predict determinants of drug resistance will provide a robust framework to rationalize and streamline the development of the next generation of therapeutics. Knowledge can be gleaned from the recent success in developing small-molecule inhibitors of the Trypanosoma cruzi cysteine proteases84, and the P. falciparum homologues might also be attractive targets22. Equally tantalizing is the metacaspase family of cysteine proteases23, which may provide a new class of drug targets (absent in humans) that are amenable to high-throughput screening and rational drug design.

An integrated, systems-biology approach in malaria

There are numerous views on how to solve the problems of malaria63,85. We have outlined our perspective on how genomics and genomics-dependent science will be used to develop antimalarial vaccines; reduce or eliminate contact between humans and infective mosquitoes; rapidly diagnose and recognize those at greatest risk for developing severe disease; and identify and treat multidrug-resistant malaria. However, we have not discussed the integration of these efforts, or their integration with complementary fields of biomedicine.

The availability of genomic sequence data, use of sequence-based, high-throughput technologies, and advances in bioinformatics to analyse and interpret genomic data will ultimately provide an integrated picture of malarial biology, pathogenesis and epidemiology. We anticipate a time when we will routinely measure real-time biological profiles, including molecular phenotypic expression and genetic polymorphisms of the vector, host and parasite, and clinical outcomes (Table 3, Fig. 2), to create molecular kinetic portraits. Such complex biological systems have properties that cannot be predicted a priori. The development of mathematical modelling methods to describe biological systems at the molecular and/or population level, and predict behaviours of these systems in response to various virtual treatments, will be crucial in developing a new generation of interventions to attack malaria. The promise of a genome-sequence-based platform thereby lies in providing an integrated reconstruction of the spectrum of molecular and cellular interactions among parasite, vector and human host, with the ultimate goal of eradicating malaria.
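To make the idea of population-level modelling of 'virtual treatments' concrete, the following Python sketch implements a simplified Ross–Macdonald-type transmission model (without incubation periods) and compares a baseline scenario with a hypothetical vector-targeted intervention. It is an illustration under stated assumptions, not a model used by the studies cited here. All parameter values, the class and function names, and the intervention itself are illustrative assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Params:
    # All values are illustrative assumptions, not field estimates.
    m: float = 2.0    # mosquitoes per human
    a: float = 0.2    # bites on humans per mosquito per day
    b: float = 0.5    # probability of mosquito-to-human transmission per bite
    c: float = 0.5    # probability of human-to-mosquito transmission per bite
    r: float = 0.01   # human recovery rate per day
    g: float = 0.1    # mosquito death rate per day


def basic_reproduction_number(p):
    """R0 = m a^2 b c / (r g) for this simplified model."""
    return p.m * p.a ** 2 * p.b * p.c / (p.r * p.g)


def simulate(p, x0=0.01, y0=0.0, days=3650, dt=0.1):
    """Forward-Euler integration of the infected fractions.

    x: fraction of humans infected; y: fraction of mosquitoes infected.
    dx/dt = m a b y (1 - x) - r x
    dy/dt = a c x (1 - y) - g y
    """
    x, y = x0, y0
    for _ in range(int(days / dt)):
        dx = p.m * p.a * p.b * y * (1 - x) - p.r * x
        dy = p.a * p.c * x * (1 - y) - p.g * y
        x += dt * dx
        y += dt * dy
    return x, y


if __name__ == "__main__":
    baseline = Params()
    # Hypothetical 'virtual treatment': biting rate cut to a quarter and
    # mosquito mortality doubled.
    treated = replace(baseline, a=0.05, g=0.2)
    for label, p in (("baseline", baseline), ("treated", treated)):
        x, _ = simulate(p, x0=0.5, y0=0.2)
        print(f"{label}: R0 = {basic_reproduction_number(p):.3f}, "
              f"human prevalence after 10 years = {x:.4f}")
```

In this simplified model, infection persists when R0 exceeds 1 and dies out when it falls below 1, so the sketch shows how an intervention can be assessed in silico before being tested in the field. More realistic models would add parasite incubation periods, host immunity, seasonality and the molecular data discussed above.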