Background reading


Understanding the RNAissance

Joachim Pietzsch

'Who could have guessed it?' Oswald Avery asked his brother Roy in a letter that he completed long after midnight on 26 May 1943. In the weeks leading up to posing this question, Avery, who was approaching retirement as a professor at Rockefeller University, had made an incredible discovery together with his colleagues Colin MacLeod and Maclyn McCarty: the long-sought-after carrier of genetic information was deoxyribonucleic acid (DNA). Nobody had previously thought that this chemically inert and seemingly boring molecule could have this role. Even Avery was still in doubt, and promised to carry out further analyses before publishing his results. 'It's a lot of fun to blow bubbles, but it's wiser to prick them yourself before someone else tries to,' he wrote. However, Avery's discovery proved to be correct, and opened up a long-hidden gateway to new knowledge of the secrets of life. Erwin Chargaff, who helped pave the way to the elucidation of the structure of DNA, was filled with enthusiasm, saying, 'I saw darkly outlined before me the start of a grammar of biology'.

Researchers were now able to deduce the molecular language of life from DNA. Its structure — the double helix discovered by James Watson and Francis Crick in 1953 — provided important clues about its function. In 1957, Crick postulated the central dogma of molecular biology, which stated that genetic information only ever flows in one direction, from DNA to proteins via an intermediate called ribonucleic acid (RNA). DNA, the carrier of genetic information, therefore became the queen of biology, with RNA lurking in her shadow like a faithful servant. But things have changed since then. The queen no longer seems to have everything so tightly under her control, and as we learn more about the many tricks that RNA has up its sleeves, the faithful servant is increasingly recognized as being a powerful advisor to the queen. So, at the beginning of the 21st century, the burgeoning appreciation of the important roles of RNA means that biology is experiencing a renaissance, and extending its grammatical manoeuvrability.

DNA, RNA and the central dogma
DNA and RNA are biopolymers constructed from nucleotides. A nucleotide is made up of a sugar molecule (ribose in RNA, deoxyribose in DNA), a phosphate group and a nitrogen-containing base (Fig. 1). DNA differs from RNA in having a hydrogen (H) atom instead of a hydroxyl (OH) group at the 2′ position of the sugar (which is where the 'deoxyribo' part of DNA's name comes from). In both DNA and RNA, sugar and phosphate form the structural backbone of the molecule. The bases found in DNA are called adenine (A), thymine (T), cytosine (C) and guanine (G), and in RNA, thymine (T) is replaced by uracil (U). The double helix of DNA fits together like two halves of a twisted ladder, the rungs being formed by bonds between specific pairs of bases, with A always pairing with T and C always pairing with G (Fig. 2). As such, each strand of DNA has an exactly defined complementary strand. Whereas DNA forms long, coiled double strands, RNA molecules are usually single stranded. For this reason, RNA can fold up in many ways, making it more structurally versatile than the relatively cumbersome DNA.
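The pairing rule described above (A with T, C with G) is enough to derive one strand from the other. The following is a minimal sketch in Python; the function name and the example sequence are illustrative assumptions, not part of the original article.

```python
# Base-pairing rules for DNA as stated in the text: A pairs with T, C pairs with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(DNA_PAIRS[base] for base in strand.upper())


example = "ATGCGT"  # hypothetical sequence, chosen only for illustration
partner = complementary_strand(example)
print(partner)  # TACGCA

# Because each base has exactly one partner, complementing twice
# gives back the original strand.
print(complementary_strand(partner) == example)  # True
```

This is why each strand 'has an exactly defined complementary strand': the mapping is one-to-one, so either strand fully determines the other.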

 


Figure 1 | The chemical structure of DNA.
a | The chemical structure of the nucleotide adenosine triphosphate (ATP). Nucleotides are made up of a sugar molecule called ribose, a phosphate group and a nitrogen-containing base (in this case adenine (A)). RNA differs from DNA by having an -OH group at the 2′ position of the ribose molecule, instead of DNA's -H group. b | The nucleotides bind to each other through their ribose and phosphate groups, forming the structural backbone of the DNA or RNA molecule. This leaves the bases free to pair with their partner from another strand.
Modified with permission from Ussery, D. W. DNA structure: A-, B- and Z-DNA helix families. In Encyclopedia of Life Sciences (Nature Publishing Group, London, 2001) © Macmillan Magazines Ltd.

 
 


Figure 2 | The double helix.
A DNA molecule, showing how the two strands attach to each other. The backbone of each strand is formed by deoxyribose sugar (S) and phosphate group (P) bonds between the nucleotides. The bases form hydrogen bonds with their specific partner on the opposite strand: adenine (A) binds with thymine (T) and cytosine (C) binds with guanine (G).
Reproduced with permission from Judd, B. H. Nucleic acids as genetic material. In Encyclopedia of Life Sciences (Nature Publishing Group, London, 2001) © Macmillan Magazines Ltd.

 

Biochemists still had little idea of the variable nature of RNA as they built on Avery's discovery to open up the field of molecular biology. Over the 20 years following the identification of DNA as the carrier of genetic information, the structure of DNA was determined and the genetic code was cracked. The various RNAs involved in the conversion of genetic information into functioning proteins — ribosomal RNA (rRNA), transfer RNA (tRNA) and messenger RNA (mRNA) — were also discovered during this period. Today, these RNAs are regarded as fitting with Crick's central dogma of molecular biology, which is still a foundation of biology teaching in schools.

According to this orthodox view, the blueprints for building all proteins are stored in DNA in the form of genes. The code of every gene comprises a specific sequence of the bases A, C, G and T. Sets of three of these letters — called base triplets, or codons — define a particular amino acid, the building blocks of proteins, in the corresponding protein product. To enable the genes to be read and produce proteins, the double helix opens like a zipper. One strand of DNA acts as a template so that, with the aid of enzymes, an mRNA molecule can be created from the information in the gene. This is called transcription and occurs in the nucleus. The mRNA then passes out of the cell nucleus and into the cytoplasm, where it heads towards the protein factories in the cell, called the ribosomes. Here, the mRNA meets tRNA molecules, each of which carries an amino acid ready to be incorporated into an emerging protein. For each codon in DNA there is a corresponding tRNA that carries the relevant amino acid. Protein synthesis then takes place at the ribosomes by the sequential addition of amino acids, according to the sequence of codons in the mRNA. This is called translation.
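The two steps described above, transcription and translation, can be sketched as a toy model in Python. Everything here is a simplifying assumption made for illustration: the template strand is written in the order in which it is read, the codon table is deliberately partial (only the codons used in the example), and the function names are hypothetical.

```python
# Pairing used when an mRNA is built on a DNA template strand:
# A on the template gives U in the mRNA, T gives A, C gives G, G gives C.
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Partial codon table (a small subset of the standard genetic code).
CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala",
    "AAA": "Lys",
    "UGG": "Trp",
    "UAA": "STOP",
}


def transcribe(template_strand: str) -> str:
    """Build the mRNA that pairs with the given DNA template strand."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in template_strand.upper())


def translate(mrna: str) -> list:
    """Read the mRNA three bases at a time and collect the amino acids."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break  # a stop codon ends the protein
        protein.append(amino_acid)
    return protein


template = "TACCGATTTACCATT"  # hypothetical template strand
mrna = transcribe(template)
print(mrna)             # AUGGCUAAAUGGUAA
print(translate(mrna))  # ['Met', 'Ala', 'Lys', 'Trp']
```

In the cell, the lookup performed here by `CODON_TABLE` is carried out by tRNA molecules, each of which recognizes a codon and delivers the matching amino acid.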

The beginning of the RNAissance
This orthodox view is not wrong, but it is no longer complete. It reflects what was known at the start of the 1970s. Then, something happened that revolutionized molecular biology research; the development of genetic engineering. This permitted production of proteins in the laboratory, which previously could only be synthesized with difficulty, if at all. All that was necessary was to implant the gene for the desired protein into a microorganism, and let it use its machinery to synthesize this foreign protein. The organism was then grown to produce large quantities of the protein, which could subsequently be harvested. This did not just lead to genetically engineered drugs; it also opened up undreamt of possibilities in biological research. It was now possible to reproduce many previously inaccessible structures, so they could be studied in detail in the laboratory. That applied not just to proteins, but also to the RNAs formed by transcription.

In 1977 a number of research groups discovered that the genes of higher organisms are often made up of alternating coding and non-coding base sequences. This was a great surprise, as the genes of bacteria, which they had worked with up to that point, were all made up of a single sequence that fully coded for a protein. However, analysis of the gene coding for the β-chain of the blood pigment haemoglobin in the mouse revealed that it contained three coding sequences (called exons) separated by two non-coding sequences (called introns). During transcription, all parts of the gene are first copied to form a strand of pre-mRNA. Only then are the introns removed with the aid of enzymes and the exons stitched together so that the now continuous exons can be translated to produce a protein. This splicing of the pre-mRNA is a multistage process, carried out by a complex of small RNA molecules and proteins known as the spliceosome (Fig. 3).
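The outcome of splicing, three exons joined after two introns are removed, can be illustrated with a short Python sketch. It models only the end result, not the spliceosome's chemistry. The sequences, the intron coordinates and the function name are invented for illustration; introns are written in lower case purely to make them visible.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove introns, given as (start, end) index pairs with end excluded,
    and join the remaining exons in their original order."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Hypothetical pre-mRNA with three exons (upper case) and two introns (lower case).
exon1, intron1, exon2, intron2, exon3 = (
    "AUGGCU", "guaaguuuuag", "AAAUGG", "guccccag", "UAA",
)
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3
introns = [(6, 17), (23, 31)]  # positions of the two introns in pre_mrna

print(splice(pre_mrna, introns))  # AUGGCUAAAUGGUAA
```

Supplying a different set of coordinates to the same pre-mRNA yields a different mature mRNA, which is the idea behind the alternative splicing mentioned later in the article.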

 


Figure 3 | A simplified view of the splicing process.
The process of splicing involves several RNA-protein complexes, called small nuclear ribonucleoproteins (snRNPs), which together make up the spliceosome. Splicing occurs in several stages. U1 snRNP binds to the boundary between exon 1 and the intron by recognizing a specific sequence. U2 subsequently binds to the branch site (A) and then U4/U5/U6 triple snRNPs join in. After a dynamic rearrangement, U1 and U4 are destabilized, and the remaining snRNP complex is activated for the two steps that remove the intron and stitch together exons 1 and 2. Before the 1980s, researchers presumed that the protein components of the snRNPs were acting as the catalytic enzymes in this process — now we know that some of the RNAs are the important catalytic components.
Reproduced with permission from Gu, J. & Reddy, R. Cellular RNAs: varied roles. In Encyclopedia of Life Sciences (Nature Publishing Group, London, 2001) © Macmillan Magazines Ltd.

 

Certain small RNA molecules in the spliceosome act as enzymes. That is a simple statement, but any researcher who said that in the 1970s would have been declared insane. According to everything that was known at the time, only proteins could act as enzymes. However, in the early 1980s Tom Cech and Sidney Altman independently discovered RNA molecules that can catalyse a chemical reaction, namely the cleavage of a bond between the sugar ribose and a phosphate group: Cech in a self-splicing intron of a ribosomal RNA precursor, and Altman in the RNA component of the enzyme ribonuclease P. It became clear that RNA is not just a passive carrier of genetic information, but also plays an active role in biological processes. This discovery was a sensation, and was rewarded with the Nobel Prize in Chemistry in 1989. RNAs with catalytic activity are called ribozymes, and we now know that ribozymes probably catalyse a large number of key reactions in the cell.

The RNA world
The discovery of ribozymes led to the hypothesis that RNA could have been the original molecule of life on earth about four billion years ago; a biopolymer with the ability to self-replicate that had developed from prebiotic organic molecules and that could both store information and catalyse chemical reactions. Until ribozymes were discovered, it had not been possible to fully explain the origin of life and evolution on the basis of DNA or proteins. DNA is the carrier of genetic information that contains the blueprints for proteins, but it can only be formed and have an effect with the aid of enzymes, which are proteins. Proteins, on the other hand, are the end product of the flow of genetic information that begins with DNA; how, then, could either have originated? DNA cannot exist without proteins, and proteins cannot exist without DNA. This chicken-and-egg situation could be answered, however, by examining RNA. RNA would have been self-sufficient as the original molecule of life; an information store that could read itself and convert the information into biological functions. It is no wonder that it stirred the imagination of a growing number of researchers.

In the first years of the post-genomic era, RNA has increasingly become a focus of interest. This is partly due to new discoveries in the field of RNA, but even more so to the fundamental insight revealed by the analysis of the sequence of the human genome in 2001. This landmark study found that the number of genes we have is around 30,000, far fewer than was previously predicted. As we have a much greater number of proteins, the findings implied that the number of genes per se is of lesser importance in generating protein diversity than the way in which the genes are converted into proteins. Human beings have around twice as many genes as the fruit fly. But the tunes played on the keyboard of these genes, and the scores that can be composed and orchestrated using their simple sequences of 'notes', are strikingly different, and explain human complexity. Human genes are multicoded, and each, on average, encodes the instructions for making 10-12 proteins, alternative splicing patterns being one common explanation for this.

The human genome sequence itself has not produced the insight into the nature of humanity that was hoped for. Instead, it has increased awareness of a new dimension of complexity on the route between transcription and translation. Less than 2% of the 3.2 billion bases in the human genome code for proteins. In bacteria, by contrast, the vast majority of the genetic information relates directly to the production of proteins. Does this suggest that the reason for the 'superiority' of humans lies in the 98% of the genome that does not code for proteins? Is it not conceivable that the end products of many mammalian genes are not proteins, but RNAs? As is now known, even in the yeast Saccharomyces cerevisiae, more than 10% of all genes code for RNA as an end product, not as an intermediate. Is it conceivable that certain RNA molecules are the actual creators and controllers of life?
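The figures quoted above can be turned into rough orders of magnitude. The short Python calculation below uses only the numbers given in the text (3.2 billion bases, 2% coding, around 30,000 genes, 10-12 proteins per gene). The variable names are illustrative, and the results are back-of-the-envelope bounds rather than measurements.

```python
genome_bases = 3_200_000_000   # bases in the human genome, as stated in the text
coding_percent = 2             # upper bound on the protein-coding share
gene_count = 30_000            # approximate number of human genes
proteins_per_gene = (10, 12)   # average range quoted in the text

# Integer arithmetic keeps the results exact.
coding_bases_upper_bound = genome_bases * coding_percent // 100
noncoding_bases_lower_bound = genome_bases - coding_bases_upper_bound
protein_estimate = tuple(gene_count * n for n in proteins_per_gene)

print(coding_bases_upper_bound)     # 64000000
print(noncoding_bases_lower_bound)  # 3136000000
print(protein_estimate)             # (300000, 360000)
```

On these numbers, fewer than 64 million bases code for proteins, while around 30,000 genes would account for roughly 300,000 to 360,000 proteins.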

The pursuit of answers to these questions has led to hypotheses that are highly controversial. Nevertheless, more and more new discoveries are being made that might revolutionize our concept of the role of RNA. The crystal structure of the ribosome was published in 2000, and shows that the actual site of protein synthesis functions completely without proteins. It is composed solely of RNA molecules, though, admittedly, their precise function is still unclear.

RNA interference
Recently, another facet of RNA has been uncovered, in the form of RNA interference (RNAi), an apparently ancient defence mechanism against foreign genetic information. Short double-stranded RNA molecules of just 22 nucleotides in length, called small interfering RNAs (siRNAs), are snipped from longer chains by an enzyme called Dicer (Fig. 4). These siRNAs interfere with the translation of sections of mRNA from which they themselves were created, and cause their destruction. Invading RNA viruses, for example, induce RNAi in a similar manner to an immune reaction, so that there is competition between replication of the invading RNA and its destruction by interference. Recent evidence has shown that siRNAs might also influence transcription in the cell nucleus, partly by determining the form of the chromatin in which the DNA is wrapped. So, which genes are transcribed and which remain wrapped up or hidden could be under the control of small RNAs. RNAi has now been developed into an excellent tool for genome research. By selecting small double-stranded RNAs of the right length under defined laboratory conditions and injecting them into cells, it is possible to switch off specific mRNAs in a targeted manner, and much more quickly and cheaply than was previously possible.
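The logic of RNA interference described above can be sketched as a toy model in Python: a long double-stranded RNA is cut into 22-nucleotide pieces, and the antisense strand of a piece is used to locate the matching stretch of an mRNA. This is an illustration under strong assumptions. Real Dicer products and RISC targeting are more intricate, and all sequences and function names below are hypothetical.

```python
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}
SIRNA_LENGTH = 22  # siRNA length given in the text


def dice(sense_strand: str, length: int = SIRNA_LENGTH) -> list:
    """Cut one strand of a long dsRNA into consecutive pieces of `length`
    nucleotides; a shorter leftover at the end is discarded."""
    last_start = len(sense_strand) - length
    return [sense_strand[i:i + length] for i in range(0, last_start + 1, length)]


def antisense(fragment: str) -> str:
    """Return the strand that pairs with `fragment`, read in the opposite direction."""
    return "".join(RNA_PAIRS[base] for base in reversed(fragment))


def find_target(mrna: str, guide: str) -> int:
    """Return the position in `mrna` that pairs with the guide strand, or -1."""
    return mrna.find(antisense(guide))


# Hypothetical 47-nucleotide strand of an invading dsRNA.
viral_sense = "AUGGCUAAAUGGCCGUUAGCAU" + "CCGAUUACGGAUCCAUGGCAAU" + "GGC"
fragments = dice(viral_sense)
guide = antisense(fragments[0])  # antisense strand used as the guide

viral_mrna = "GGGAAA" + viral_sense + "AAAA"  # hypothetical mRNA containing the target
unrelated_mrna = "C" * 60                     # hypothetical mRNA without the target

print(len(fragments))                      # number of 22-nucleotide pieces
print(find_target(viral_mrna, guide))      # position of the matching site
print(find_target(unrelated_mrna, guide))  # -1: no matching site, mRNA left alone
```

The specificity of the mechanism comes from the same base-pairing rule that holds the double helix together: only an mRNA containing the sequence complementary to the guide is marked for cleavage.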

 


Figure 4 | RNA interference.
RNA interference (RNAi) is an apparently ancient defence mechanism against foreign double-stranded RNA (dsRNA). RNAs of just 22 nucleotides in length, called small interfering RNAs (siRNAs), are snipped from longer dsRNA chains by an enzyme called Dicer. The antisense strand of the siRNA is used by an RNA interference silencing complex (RISC) to guide messenger RNA (mRNA) cleavage, so promoting mRNA degradation.
Modified with permission from McManus, M. T. & Sharp, P. A. Gene silencing in mammals by small interfering RNAs. Nature Rev. Genet. 3, 737-747 (2002) © Macmillan Magazines Ltd.

 

RNAi was discovered by Andrew Fire and Craig Mello in 1998. The method of using siRNAs to silence genes in mammalian cells was developed in 2001 by Tom Tuschl, who is now carrying out research at the Rockefeller University in New York, where Oswald Avery once worked. So, the circle is complete. In 1943, when Avery and his colleagues identified DNA as the carrier of genetic information, the 'Four Quartets' by T. S. Eliot appeared as a single volume for the first time. One of the best-known verses from this cycle was cited by the authors of the human genome paper at the end of their article in Nature on 15 February 2001: 'We shall not cease from exploration/And the end of all our exploring/Will be to arrive where we started/And know the place for the first time.' One might interpret this verse as an allegory of research in molecular biology at present. Our search for an explanation for life leads us forwards to our roots, because we encounter the molecule that might have started everything: RNA. Who could have guessed it?
