The origin of the current AIDS pandemic has been a subject of great interest and speculation. Viral archaeology sheds light on the geography and timescale of the early diversification of HIV-1 in humans.
Human immunodeficiency virus type 1 (HIV-1) must have been spreading through the human population long before AIDS was first described in 1981, but very few strains from this 'prehistoric' period (pre-1980s) have been characterized. Viral sequences from earlier times can provide insight into the early spread of HIV-1, because the rapid rate of evolution of this virus — up to a million times faster than that of animal DNA — means that substantial amounts of sequence change occur in a matter of decades1. On page 661 of this issue, Worobey et al.2 describe the sequences of partial genome fragments of HIV-1 from a lymph-node biopsy collected in 1960 in Léopoldville (now Kinshasa, Democratic Republic of the Congo). They compare these sequences with those of other HIV-1 strains, shedding light on the early evolution and diversification of this virus in Africa.
HIV-1 strains are divided into three groups, each of which was independently derived from a simian immunodeficiency virus (SIV) that naturally infects chimpanzees in west-central Africa3. Whereas two of these groups are rare, the third, group M, has spread throughout the world and is the cause of more than 95% of HIV infections globally. Group M can be further divided into many subtypes (A–K), which seem to have arisen through founder events. For example, subtype B, which encompasses all the strains originally described in North America and Europe, is very rare in Africa, and reflects such a founder event. Last year, Worobey and colleagues showed4 that this subtype probably arose from a single strain that was carried from Africa to Haiti before spreading to the United States and onwards. The newly described2 1960 virus (DRC60) falls within, but close to the ancestor of, subtype A.
DRC60 is not the first 'ancient' HIV-1 sample to be characterized: viral sequences from a blood-plasma sample originally obtained in 1959 — also from Léopoldville — were published 10 years ago5. The importance of DRC60 is that it is highly divergent from the 1959 sample (ZR59), which was most closely related to the ancestor of subtype D, thus directly demonstrating that, by 50 years ago, group M HIV-1 strains had already undergone substantial diversification.
The ZR59 and DRC60 sequences differ by about 12%, a value similar to distances now seen between the most divergent strains within subtypes. As the positions of ZR59 and DRC60 within the group M phylogeny indicate that the various subtypes already existed 50 years ago, simple extrapolation suggests that these two viral sequences had a common ancestor at least 50 years before that. For a more robust estimate of the date of the common ancestor of HIV-1 group M strains, Worobey and colleagues used state-of-the-art statistical analyses, allowing a variety of models for the growth of the HIV-1 pandemic and variable rates of evolution. The different analyses gave broadly similar estimates for the date of that common ancestor, between 1902 and 1921, with 95% confidence intervals ranging no later than 1933. These dates are a little earlier than, but do not differ significantly from, a previous estimate1 of 1931 from an analysis that did not include the 50-year-old viruses.
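The "simple extrapolation" in this paragraph can be sketched as a strict molecular-clock calculation. This is only an illustration of the arithmetic, not the relaxed-clock statistical analysis that Worobey and colleagues actually used. It assumes a single constant substitution rate on every lineage, takes the 12% divergence and the 50-year span from the text, and uses hypothetical function names (`per_lineage_rate`, `common_ancestor_year`).

```python
def per_lineage_rate(divergence, years):
    """Rate per site per year on one lineage, assuming the pairwise
    divergence built up along two lineages over `years` (strict clock)."""
    return divergence / (2.0 * years)


def common_ancestor_year(year_a, year_b, divergence, rate):
    """Year T of the common ancestor of two dated samples.

    Under a strict clock: divergence = rate * ((year_a - T) + (year_b - T)),
    which is solved here for T.
    """
    return (year_a + year_b - divergence / rate) / 2.0


# Assumption from the text: about 12% divergence within subtypes today,
# accumulated over at least 50 years. Using exactly 50 years gives an
# upper bound on the rate, and so a latest plausible ancestor date.
rate = per_lineage_rate(0.12, 50)

# ZR59 (sampled 1959) and DRC60 (sampled 1960) also differ by about 12%.
tmrca = common_ancestor_year(1959, 1960, 0.12, rate)

print(f"rate per lineage: {rate:.4f} substitutions per site per year")
print(f"common ancestor no later than about {tmrca:.0f}")
```

With these inputs the crude estimate lands around 1910, inside the 1902 to 1921 range reported from the full analyses. If the subtypes are older than 50 years, the implied rate is lower and the ancestor date moves earlier.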
The interpretation that HIV-1 was spreading among humans for 60–80 years before AIDS was first recognized should not be surprising. If the epidemic grew roughly exponentially from only one or a few infected individuals around 1910 to the more than 55 million estimated to have been infected by 2007, there were probably only a few thousand HIV-infected individuals by 1960, all in central Africa. Given the diverse array of symptoms characteristic of AIDS, and the often-long asymptomatic period following infection, it is easy to imagine how the nascent epidemic went unrecognized. Conversely, such a low prevalence at that time implies that the Congolese co-authors of the paper2 were very lucky to come across this infected sample, even if most infections were concentrated in the area of Léopoldville. But can we trust these sequences?
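The growth argument above can be checked with a few lines of arithmetic. The sketch below assumes pure exponential growth from a single infected individual in 1910 to 55 million cumulative infections in 2007, which are the round numbers given in the text. The function names are illustrative, and the real epidemic certainly did not grow at one constant rate.

```python
import math


def growth_rate(n_start, n_end, year_start, year_end):
    """Constant exponential growth rate (per year) linking two counts."""
    return math.log(n_end / n_start) / (year_end - year_start)


def infected_in(year, n_start, year_start, rate):
    """Number infected in `year` under constant exponential growth."""
    return n_start * math.exp(rate * (year - year_start))


# Assumed inputs, taken from the text: one founder around 1910,
# more than 55 million infected by 2007.
r = growth_rate(1, 55e6, 1910, 2007)
n_1960 = infected_in(1960, 1, 1910, r)
doubling_time = math.log(2) / r

print(f"growth rate: {r:.3f} per year")
print(f"doubling time: {doubling_time:.1f} years")
print(f"estimated infections in 1960: {n_1960:,.0f}")
```

Under these assumptions the 1960 figure comes out in the thousands, on the order of ten thousand at most, which is tiny next to the population of central Africa at the time. The result is sensitive to the start date: a founder nearer 1921 gives a smaller 1960 count, and one nearer 1902 a larger one.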
In work on ancient DNA, contamination is especially problematic, and the work should, if possible, be replicated in other laboratories. For DRC60, independent analyses were performed at the University of Arizona and Northwestern University, Illinois. The sequences obtained were similar, but not identical, exactly as expected when samples come from the diverse set of related viral sequences that arise within an infected individual6 because of the virus's rapid rate of evolution. Furthermore, the distances along the evolutionary tree from the group M ancestor to the ZR59 and DRC60 sequences are much shorter than the distances between that ancestor and modern strains. This is consistent with the earlier dates of isolation of ZR59 and DRC60, and confirms that these viruses are indeed old.
Although the ZR59 and DRC60 sequences can show only that two subtypes were present in Léopoldville around 1960, in more recent times the greatest diversity of group M subtypes — as well as many divergent strains that have not been classified — has been found in Kinshasa7. So it seems likely that all of the early diversification of HIV-1 group M viruses occurred in the Léopoldville area. Yet the SIV strains most closely related to HIV-1 group M have been found infecting chimpanzees in the southeast corner of Cameroon3, some 700 kilometres away (Fig. 1a). The simplest explanation for how SIV jumped to humans would be through exposure of humans to the blood of chimpanzees butchered locally for bushmeat. So why did the pandemic start in Léopoldville? And, as there must have been many opportunities for such transmission over past millennia, why did the AIDS pandemic not occur until the twentieth century?
The answer may be that, for an AIDS epidemic to get kick-started, HIV-1 needs to be seeded in a large population centre. But cities of significant size did not exist in central Africa before 1900. Worobey and colleagues2 reproduce demographic data showing the rapid growth of cities in west-central Africa during the twentieth century. Léopoldville was not only the largest of these cities, but also a likely destination for a virus escaping from southeast Cameroon. In the early 1900s, the main routes of transportation out of that remote forest region were rivers; those surrounding this area flow south, ultimately draining into the Congo River, and leading to Léopoldville (Fig. 1).
The date estimates of Worobey et al. are for an ancestral virus, present in the first individual to give rise to separate transmission chains that still exist today. We may never know how many individuals were infected in the previous transmission chain, the one that led from the person initially infected with SIV to the progenitor of the current pandemic in humans. This exception aside, we can now paint a remarkably detailed picture of the time and place of origin of HIV-1 group M viruses and their early diversification, and thus of the prehistory of the AIDS pandemic.
1. Korber, B. et al. Science 288, 1789–1796 (2000).
2. Worobey, M. et al. Nature 455, 661–664 (2008).
3. Keele, B. F. et al. Science 313, 523–526 (2006).
4. Gilbert, M. T. P. et al. Proc. Natl Acad. Sci. USA 104, 18566–18570 (2007).
5. Zhu, T. et al. Nature 391, 594–597 (1998).
6. Meyerhans, A. et al. Cell 58, 901–910 (1989).
7. Vidal, N. et al. J. Virol. 74, 10498–10507 (2000).