It is now widely believed that almost four billion years ago, before the first living cells, life consisted of assemblies of self-reproducing macromolecules. The molecular candidate thought to mediate this activity was RNA, which can combine the necessary properties of encoding information and catalysing chemical reactions — functions that are now fulfilled largely by DNA and proteins, respectively. From theoretical arguments, it can be expected that a system of interacting molecules will give rise to complex, and even life-like, behaviour, but there is still debate about whether RNA was the first or the only macromolecule to participate in such activity, with both protein and DNA (or any combination with or without RNA) representing alternatives.

Circumstantial evidence for the central position of RNA in the origin of life can be found in ‘relic’ pieces of RNA that hold a few of the most important functions in the cell. Perhaps the most convincing observation is that, in the synthesis of proteins on the ribosome, the key chemical event — peptide-bond formation — is catalysed solely by RNA, suggesting that primacy lies with RNA rather than protein. A major impediment to full acceptance of an ‘RNA world’ is that, although it can easily be imagined that a pure RNA machine (a proto-ribosome) can make proteins, there is no equivalent RNA machine to make RNA (a ribopolymerase). All the RNA we know is made by protein, leading to perhaps the original ‘chicken-and-egg’ problem of which came first.

Some mechanisms for replication in the RNA world have been put forward and, following the current systems of protein-mediated polynucleotide synthesis, all involve the creation of a complementary daughter strand using Watson–Crick base-pairing. But from a mechanistic viewpoint, such a model contains a fundamental problem: if a ribopolymerase were to make a complementary copy of itself, it would need to recopy this to obtain a new functional ribopolymerase. This implies that both the ribopolymerase sequence and its complement would have to coexist. But if these two copies came together, the result would be a double-stranded Watson–Crick helix (as found in some RNA viruses), not a new ribopolymerase. Even if both sequences had well-determined secondary structures, the perfect complementarity of the Watson–Crick pairing would act as a sink, leading to a sterile population of double-stranded molecules.

Credit: W. R. TAYLOR

In a world without any other type of molecule (such as protein) to prevent these unwanted interactions, it might be concluded that a pure RNA world could not have been viable. But what if the ribopolymerase did not synthesize a complementary strand? From a chemical viewpoint, there is no reason why a polymerase must make a complementary strand that runs in the reverse direction to the template strand. In modern protein polymerases, nucleoside triphosphates are added to the 3′ end of the transcript without the direct participation of the template strand. If the template strand were flipped (making a parallel complement), then all that would be lost is some capacity for the template and transcript to remain base-paired, as parallel nucleic acid strands cannot form a duplex with Watson–Crick base-pairing. In an RNA world, the loss of this interaction would be an advantage, because it would prevent the formation of a dead-end double helix.
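The distinction between the two kinds of copy can be made concrete with a minimal sketch. The function names and the template sequence below are illustrative assumptions, not taken from any published model, and only canonical Watson–Crick pairs are counted (G–U wobble pairs and non-Watson–Crick geometries are ignored).

```python
# Canonical RNA Watson-Crick partners (assumption: G-U wobble pairs are ignored).
WC = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antiparallel_complement(seq):
    """Conventional copy: complement each base and reverse, so the result reads 5'->3'."""
    return "".join(WC[base] for base in reversed(seq))


def parallel_complement(seq):
    """Hypothetical copy: complement each base but keep the template's 5'->3' order."""
    return "".join(WC[base] for base in seq)


def antiparallel_pairs(a, b):
    """Count Watson-Crick pairs when strands a and b (both 5'->3') align antiparallel."""
    return sum(WC[x] == y for x, y in zip(a, reversed(b)))


template = "GGCAUACGUUAGC"  # arbitrary illustrative sequence

conventional = antiparallel_complement(template)
parallel = parallel_complement(template)

# Either kind of copy must be copied again to recover the original sequence.
print(antiparallel_complement(conventional) == template)
print(parallel_complement(parallel) == template)

# The conventional copy can pair with its template along its whole length
# (the "sink"); the parallel copy has no such full antiparallel match here.
print(antiparallel_pairs(template, conventional), "of", len(template))
print(antiparallel_pairs(template, parallel), "of", len(template))
```

In this toy picture both schemes need two rounds of copying, but only the conventional intermediate is a perfect antiparallel partner for its template.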

When replication starts at the 3′ end of the template strand, the transcript cannot be recopied until it is complete. In an RNA world, this strategy would leave a full-length transcript exposed to a hostile environment in which it might make many spurious interactions with other RNA molecules (especially its own complement) and, unless it quickly adopted a compact conformation, it would be susceptible to hydrolysis. By contrast, a polymerase that begins at the 5′ end of the template produces a transcript on which retranscription can start almost immediately, minimizing the exposure of single-stranded RNA and avoiding hybridization. This strategy is similar to the immediate translation of messenger RNA in bacteria, which allows thermophilic species to minimize the exposure time of the single-stranded message at high temperatures. Immediate retranscription would also confer a similar protection in a hotter primeval world. Two ribopolymerases operating in tandem as a dimer could take a ribopolymerase sequence as a template and produce a new ribopolymerase. However, the intermediate parallel-complementary strand need not be discarded, and could be picked up by another ribopolymerase, potentially leading to a vast network of linked replication.
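The timing argument can be illustrated with a deliberately crude toy model. It assumes that both polymerases add one nucleotide per step at the same rate, that "exposure" is simply the number of intermediate nucleotides made but not yet recopied, and that the trailing polymerase follows by a fixed `lag`. The function name, the mode labels and the `lag` parameter are all hypothetical.

```python
def two_stage_copy(length, mode, lag=1):
    """Toy model of copying a template twice; returns (finish_step, peak_exposed).

    mode="3prime_start": the intermediate must be complete before recopying begins.
    mode="5prime_start": recopying trails the first polymerase by `lag` steps.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if mode == "3prime_start":
        start_second = length  # second round waits for the full-length intermediate
    elif mode == "5prime_start":
        start_second = lag     # second round follows almost immediately
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    made = copied = step = peak_exposed = 0
    while copied < length:
        step += 1
        if made < length:
            made += 1          # first polymerase extends the intermediate
        if step > start_second:
            copied += 1        # second polymerase recopies the intermediate
        peak_exposed = max(peak_exposed, made - copied)
    return step, peak_exposed


for mode in ("3prime_start", "5prime_start"):
    finish, exposed = two_stage_copy(10, mode)
    print(mode, "finished at step", finish, "peak exposed nucleotides", exposed)
```

Under these assumptions the tandem arrangement finishes in roughly half the steps and never leaves more than `lag` nucleotides of intermediate exposed, whereas the conventional arrangement exposes the whole intermediate before recopying can begin.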

A difference in transcription direction may also explain why the RNA world eventually had to become extinct. When life progressed to a compartmentalized, genome-based system, it would have been necessary to have a break in transcription to allow the physical separation of the messages. If each new copy were to be immediately retranscribed, a continual cascade of transcription would ensue. Although this may be efficient in a ‘soup’, it gives no defined point for the separation of individual genomes. If proteins emerged alongside such a system, adopting the ‘parasitic’ role suggested by Freeman Dyson, those that gained some capacity to synthesize a complementary strand would not only have created genomes that could be compartmentalized into cells, but would also have produced complementary strands that hybridized with the ribopolymerases, hastening their decline. The origin of protein polymerases may therefore have driven the ascent of cellular genomes and the eventual end of the RNA world.