Main

Despite massive efforts to eradicate malaria in the 1950s and early 1960s, today there are more humans infected with malaria than at any other time in history. More than 500 million people are infected with malaria worldwide, and one fourth of the world's population is at risk of infection. Furthermore, at least 2.5 million children die each year of malaria, most of them in Africa. Those children surviving chronic infection suffer a combination of anemia and immune suppression that leaves them vulnerable to other fatal illnesses. Drug resistance in the parasite is now widespread, further compromising prevention and treatment strategies. Malaria has confounded some of the best minds of this century. A hundred years after the discovery that mosquitoes transmit malaria, we still do not know enough about the disease to defeat it permanently1. However, a principal milestone in the field of malaria research has now been reached with the recent Science paper by Gardner et al.2 that presents the full-length sequence of one of the 14 chromosomes of Plasmodium falciparum, the organism that causes the most deadly form of human malaria. This sequencing achievement is the first for an organism as complex as P. falciparum and is particularly notable because it was feared that the intensely AT-rich Plasmodium genome would prove intractable to even the most sophisticated sequencing technology.

The Malaria Genome Sequencing Project is a cooperative effort between three sequencing centers—The Institute for Genomic Research (TIGR), the Sanger Center and the Stanford Genome Center—and the malaria research community with the goal of sequencing the entire genome of P. falciparum. It is sponsored by the National Institutes of Health, the Burroughs Wellcome Fund, the U.S. Department of Defense and the Wellcome Trust, and seeks to define every gene in the organism and, armed with this information, to identify new targets for drug and vaccine development. The sequence of chromosome 2, reported by the TIGR and Department of Defense groups2, is proof that it will be possible to achieve this goal.

The P. falciparum genome presents the sequencing 'gurus' with a number of technical hurdles to overcome. Most prominent of these is its remarkably high A+T content (80.2 percent in chromosome 2). The sequencing of chromosome 2 required the investigators to adapt standard cloning techniques, modify computer programs that assemble the sequence, and develop new methods to finish the sequence analysis. The entire genome is approximately 30 Mb in length and is arranged in 14 individual chromosomes that vary in length between wild-type isolates.
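
As a rough illustration of what the A+T figure quoted above measures, the following minimal sketch computes the percentage of A and T bases in a DNA string. The function name `at_content` and the toy fragment are hypothetical and invented for illustration; they are not taken from the chromosome 2 sequence or from the authors' software.

```python
def at_content(sequence: str) -> float:
    """Return the percentage of A and T among the unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored; this is an assumption of the sketch.
    """
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * at / total


# Invented toy fragment, NOT real P. falciparum sequence.
toy_fragment = "ATATTTAAGCATTATAATTA"
print(f"{at_content(toy_fragment):.1f}% A+T")
```

A base composition this skewed toward A and T leaves little sequence variety, which helps to explain why cloning and assembly needed the special handling described above.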

Chromosome 2 is just under 1 Mb in length and contains 209 predicted open reading frames. By comparing gene sequences in the parasite with those of other organisms, the investigators were able to identify homologues for about 40 percent of the genes, with the remaining 60 percent falling into the class of unidentified open reading frames. This percentage of unidentified open reading frames is substantially higher than that reported in other organisms. The authors suggest that this may be due, in part, to the unusually high percentage of proteins with non-globular domains predicted from the open reading frames of P. falciparum chromosome 2. (The function and importance of these non-globular protein domains are unknown, although the investigators suggest that they may provide a selective advantage to the parasite.) An additional contributing factor is certainly that the malaria parasite has an unusual and complex life cycle, existing in multiple forms in both the human host and mosquito vector. Host–parasite interactions and immune evasion strategies require unique genes and proteins.

One of the most interesting features of chromosome 2 is the arrangement of families of related genes (see Fig.). At each end of the chromosome, there are two families of genes: the var genes, which encode variant parasite antigens expressed on the surface of infected erythrocytes, and the rifin genes, a previously cloned gene family of unknown function named for the Rif-1 repeat element3. The arrangement of the rifin genes interspersed with the variant surface antigen genes, coupled with the discovery that the rifin genes have signal sequences, suggests that the products of these genes are expressed on the erythrocyte surface and may be associated with variant surface antigens. Neither the spatial relationship of these gene families nor the juxtaposition of the signal sequences could have been determined using either expressed sequence tag approaches or standard mapping techniques.

One of the immediate applications of the Malaria Genome Sequencing Project is the identification of new potential targets for drug development, and the chromosome 2 sequence yields several new candidates. Of particular interest are enzymes present in the malaria parasite that are drug targets in other organisms and enzymes that are unique to the parasite and not found in the host. One example from chromosome 2 that seems promising is the 3-ketoacyl-ACP synthase III (FabH), an enzyme in the Type II fatty acid synthase system that had been thought to be restricted to bacteria and plant plastid systems. Additional research will be needed to determine whether such putative targets are indeed essential for parasite survival and whether they are amenable to drug development. The sequence gives us direct access to the genes, and the research can move directly to the focused effort of target evaluation. Further comparison of the P. falciparum genome with those of related organisms will provide leads to conserved genes of unknown function that may play an essential part in parasite survival.

Information from the genome project may also help scientists to develop new DNA-based vaccines (see page 1351 of this issue). Recent work from Hoffman's group demonstrates that a DNA vaccine based on the gene encoding the sporozoite surface protein elicits an immune response in humans4. New genes encoding putative surface proteins, such as the Rif-1 gene family, have been identified on chromosome 2 and some of these may warrant further investigation as potential vaccine targets. Furthermore, as new protein domains important in cell-cell interactions are identified through biological and immunological analysis, related genes can be readily sought in the sequence database. This will provide immediate information regarding variant and conserved regions that will be useful for vaccine design.

Sequencing and assembly of the remaining thirteen chromosomes of P. falciparum are under way, with about 40 percent of the 6,500 predicted genes at least partially sequenced. This now permits us to think about future applications that will be possible when the entire genome sequence becomes available. Of particular interest is the analysis of gene expression at the level of the entire organism through the use of DNA microarray5,6 or Serial Analysis of Gene Expression strategies7. These approaches have proved extremely powerful in the analysis of Saccharomyces cerevisiae gene expression, and have led to the identification of large groups of genes that respond when yeast are grown under different conditions or in the presence of drugs8. It seems reasonable to assume that a similar multigene response will occur in P. falciparum after exposure to new vectors, drugs or the host immune response. Previous work has focused on single genes; the availability of the tools provided by the P. falciparum Genome Sequencing Project will now allow us to take a more integrative approach, analyzing the interaction of several genes in response to a single stimulus. The ability to dissect these kinds of complex responses is likely to bring a whole new dimension to our understanding of the parasite.

Finally, availability of the chromosome 2 sequence will allow for comparison of different populations and isolates of the parasite and the determination of sequence polymorphisms and allelic diversity9. Comparison of sequences from recent parasite isolates with the reference strain sequence may provide insight into mechanisms of pathogenicity and virulence. Comparison of gene expression profiles should also provide clues to those genes that are essential for growth in the human host but are not required for growth in in vitro culture.
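
To make the idea of comparing an isolate with the reference strain concrete, here is a minimal sketch that lists single-base differences between two sequences. It assumes the sequences have already been aligned to the same length, and it does not handle insertions or deletions. The function name `find_polymorphisms` and both toy sequences are hypothetical and invented for illustration; real comparisons would rely on a proper alignment tool.

```python
from typing import List, Tuple


def find_polymorphisms(reference: str, isolate: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, reference base, isolate base) for each mismatch.

    Assumes both sequences are pre-aligned and of equal length.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be pre-aligned to the same length")
    differences = []
    pairs = zip(reference.upper(), isolate.upper())
    for position, (ref_base, iso_base) in enumerate(pairs, start=1):
        if ref_base != iso_base:
            differences.append((position, ref_base, iso_base))
    return differences


# Invented toy sequences, NOT real parasite data.
reference_strain = "ATTATAGCATTA"
field_isolate = "ATTACAGCATAA"
for position, ref_base, iso_base in find_polymorphisms(reference_strain, field_isolate):
    print(f"position {position}: {ref_base} -> {iso_base}")
```

Applied across many isolates, a table of such differences is the raw material for the estimates of allelic diversity mentioned above.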

Genomic research promises new insight into mechanisms by which the malaria parasite operates and might be defeated. The challenge for the malaria research community is to bring this knowledge to the treatment and prevention of this deadly disease. Every twelve seconds, a child dies of malaria. The hope that this project brings is that the new knowledge created will lead to a fundamental and lasting change in that statistic.

Chromosome 2 of Plasmodium falciparum. The availability of the complete sequence of chromosome 2 of P. falciparum, the parasite causing the fatal form of human malaria, should enable the identification of new targets for drug and vaccine development. The var and rifin genes, arranged in subtelomeric repeats at both ends of the chromosome, may be potential drug or vaccine targets, particularly because the var genes encode (and the rifin genes are thought to encode) parasite antigens that are expressed on the infected erythrocyte surface.