DNA Transcription

By: Suzanne Clancy, Ph.D. © 2008 Nature Education
Citation: Clancy, S. (2008) DNA transcription. Nature Education 1(1)

If DNA is a book, then how is it read? Learn more about the DNA transcription process, where DNA is converted to RNA, a more portable set of instructions for the cell.

 

The genetic code is frequently referred to as a "blueprint" because it contains the instructions a cell requires in order to sustain itself. We now know that there is more to these instructions than simply the sequence of letters in the nucleotide code, however. For example, vast amounts of evidence demonstrate that this code is the basis for the production of various molecules, including RNA and protein. Research has also shown that the instructions stored within DNA are "read" in two steps: transcription and translation. In transcription, a portion of the double-stranded DNA template gives rise to a single-stranded RNA molecule. In some cases, the RNA molecule itself is a "finished product" that serves some important function within the cell. Often, however, transcription of an RNA molecule is followed by a translation step, which ultimately results in the production of a protein molecule.

Visualizing Transcription

The process of transcription can be visualized by electron microscopy (Figure 1); in fact, it was first observed using this method in 1970. In these early electron micrographs, the DNA molecules appear as "trunks," with many RNA "branches" extending out from them. When DNase and RNase (enzymes that degrade DNA and RNA, respectively) were added to the molecules, DNase eliminated the trunk structures, while RNase wiped out the branches.

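DNA is double-stranded, but only one strand serves as a template for transcription at any given time. This template strand is called the noncoding strand; the other, nontemplate strand is referred to as the coding strand because its sequence matches that of the new RNA molecule (with uracil in place of thymine). In most organisms, the strand of DNA that serves as the template for one gene may be the nontemplate strand for other genes within the same chromosome.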
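The relationship between the template strand, the nontemplate strand, and the RNA product can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzyme: the function names and the nine-base example sequence are hypothetical, and the template is assumed to be written 3′ to 5′ so that the RNA reads 5′ to 3′.

```python
# Base-pairing rules used when RNA is built against a DNA template.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules between the two DNA strands.
DNA_TO_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5.upper())


def nontemplate_strand(template_3_to_5: str) -> str:
    """Return the nontemplate DNA strand (5'->3') paired with the template."""
    return "".join(DNA_TO_DNA_COMPLEMENT[base] for base in template_3_to_5.upper())


template = "TACGGCATT"                 # hypothetical template, written 3'->5'
rna = transcribe(template)             # "AUGCCGUAA"
partner = nontemplate_strand(template) # "ATGCCGTAA"

# The RNA matches the nontemplate strand, with U in place of T.
print(rna, partner, rna == partner.replace("T", "U"))
```

The final comparison shows why the RNA carries the same sequence as the strand that was not used as the template.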

The Transcription Process

The process of transcription begins when an enzyme called RNA polymerase (RNA pol) attaches to the template DNA strand and begins to catalyze production of complementary RNA. Polymerases are large enzymes composed of approximately a dozen subunits, and when active on DNA, they are also typically complexed with other factors. In many cases, these factors signal which gene is to be transcribed.

Three different types of RNA polymerase exist in eukaryotic cells, whereas bacteria have only one. In eukaryotes, RNA pol I transcribes the genes that encode most of the ribosomal RNAs (rRNAs), and RNA pol III transcribes the genes for one small rRNA, plus the transfer RNAs that play a key role in the translation process, as well as other small regulatory RNA molecules. Thus, it is RNA pol II that transcribes the messenger RNAs, which serve as the templates for production of protein molecules.

Transcription Initiation

The first step in transcription is initiation, when the RNA pol binds to the DNA upstream (5′) of the gene at a specialized sequence called a promoter. In bacteria, promoters are usually composed of three sequence elements, whereas in eukaryotes, there are as many as seven elements.

In prokaryotes, most genes have a sequence called the Pribnow box, with the consensus sequence TATAAT positioned about ten base pairs away from the site that serves as the location of transcription initiation. Not all Pribnow boxes have this exact nucleotide sequence; these nucleotides are simply the most common ones found at each site. Although substitutions do occur, each box nonetheless resembles this consensus fairly closely. Many genes also have the consensus sequence TTGACA at a position 35 bases upstream of the start site, and some have what is called an upstream element, which is an A-T rich region 40 to 60 nucleotides upstream that enhances the rate of transcription (Figure 2). In any case, upon binding, the RNA pol "core enzyme" binds to another subunit called the sigma subunit to form a holoenzyme capable of unwinding the DNA double helix in order to facilitate access to the gene. The sigma subunit conveys promoter specificity to RNA polymerase; that is, it is responsible for telling RNA polymerase where to bind. There are a number of different sigma subunits that bind to different promoters and therefore assist in turning genes on and off as conditions change.
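Because a consensus sequence lists only the most common base at each position, a search for promoter elements has to tolerate substitutions. The sketch below scans a sequence for windows that differ from the Pribnow box consensus TATAAT by at most a chosen number of mismatches. The function names, the mismatch threshold, and the 20-base example sequence are assumptions made for illustration; real promoter prediction typically uses position weight matrices rather than a simple mismatch count.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_consensus_sites(sequence: str, consensus: str = "TATAAT",
                         max_mismatches: int = 1):
    """Return (start, window, mismatches) for each window close to the consensus."""
    sequence = sequence.upper()
    k = len(consensus)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        mismatches = count_mismatches(window, consensus)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical sequence with one exact box and one box carrying a substitution.
example = "CCGTATAATGGCTATGATCC"
print(find_consensus_sites(example))
# [(3, 'TATAAT', 0), (12, 'TATGAT', 1)]
```

Passing a different `consensus` string lets the same function look for other promoter elements.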

Eukaryotic promoters are more complex than their prokaryotic counterparts, in part because eukaryotes have the aforementioned three classes of RNA polymerase that transcribe different sets of genes. Many eukaryotic genes also possess enhancer sequences, which can be found at considerable distances from the genes they affect. Enhancer sequences control gene activation by binding with activator proteins and altering the 3-D structure of the DNA to help "attract" RNA pol II, thus regulating transcription. Because eukaryotic DNA is tightly packaged as chromatin, transcription also requires a number of specialized proteins that help make the coding strand accessible.

In eukaryotes, the "core" promoter for a gene transcribed by pol II is most often found immediately upstream (5′) of the start site of the gene. Most pol II genes have a TATA box (consensus sequence TATAAA) 25 to 35 bases upstream of the initiation site, which affects the transcription rate and determines the location of the start site. Eukaryotic RNA polymerases use a number of essential cofactors (collectively called general transcription factors), and one of these, TFIID, recognizes the TATA box and ensures that the correct start site is used. Another cofactor, TFIIB, recognizes a different common consensus sequence, G/C G/C G/C G C C C, approximately 32 to 38 bases upstream (Figure 3).

Figure 3: The promoters of genes transcribed by RNA polymerase II consist of a core promoter and a regulatory promoter that contain consensus sequences.
Not all the consensus sequences shown are found in all promoters.

The terms "strong" and "weak" are often used to describe promoters and enhancers, according to their effects on transcription rates and thereby on gene expression. Alteration of promoter strength can have deleterious effects upon a cell, often resulting in disease. For example, some tumor-promoting viruses transform healthy cells by inserting strong promoters in the vicinity of growth-stimulating genes, while translocations in some cancer cells place genes that should be "turned off" in the proximity of strong promoters or enhancers.

Enhancer sequences do what their name suggests: They act to enhance the rate at which genes are transcribed, and their effects can be quite powerful. Enhancers can be thousands of nucleotides away from the promoters with which they interact, but they are brought into proximity by the looping of DNA. This looping is the result of interactions between the proteins bound to the enhancer and those bound to the promoter. The proteins that facilitate this looping are called activators, while those that inhibit it are called repressors.

Transcription of eukaryotic genes by polymerases I and III is initiated in a similar manner, but the promoter sequences and transcriptional activator proteins vary.

Strand Elongation

Once transcription is initiated, the DNA double helix unwinds and RNA polymerase reads the template strand, adding nucleotides to the 3′ end of the growing chain. At a temperature of 37 degrees Celsius, nucleotides are added at a rate equivalent to about 15 to 20 amino acids per second in bacteria (Dennis & Bremer, 1974), which at three nucleotides per codon corresponds to roughly 45 to 60 nucleotides per second. Eukaryotes proceed at a much slower pace, equivalent to approximately five to eight amino acids per second (Izban & Luse, 1992).
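The elongation rates above are quoted in amino acid equivalents per second. Assuming three nucleotides per codon, the short sketch below converts those rates into nucleotides per second and estimates how long an uninterrupted polymerase would take to transcribe a gene. The 3,000-nucleotide gene length and the function name are hypothetical, and the estimate ignores pausing, initiation, and termination.

```python
NUCLEOTIDES_PER_CODON = 3  # each amino acid is encoded by three nucleotides


def transcription_time_seconds(gene_length_nt: int, codons_per_second: float) -> float:
    """Estimate elongation time from a rate given in amino acid equivalents."""
    nucleotides_per_second = codons_per_second * NUCLEOTIDES_PER_CODON
    return gene_length_nt / nucleotides_per_second


gene_length = 3000  # hypothetical gene length in nucleotides

# Bacterial rate at the upper end of the quoted range (20 per second).
print(transcription_time_seconds(gene_length, 20))  # 50.0

# Eukaryotic rate at the lower end of the quoted range (5 per second).
print(transcription_time_seconds(gene_length, 5))   # 200.0
```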

Transcription Termination

Figure 4: Termination by bacterial rho-independent terminators is a multistep process.

Terminator sequences are found close to the ends of coding sequences. Bacteria possess two types of these sequences. In rho-independent terminators, inverted repeat sequences are transcribed; they can then fold back on themselves in hairpin loops, causing RNA pol to pause and resulting in release of the transcript. On the other hand, rho-dependent terminators make use of a factor called rho, which actively unwinds the DNA-RNA hybrid formed during transcription, thereby releasing the newly synthesized RNA (Figure 4).
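The hairpin in a rho-independent terminator forms because the transcript contains an inverted repeat: a stretch of RNA followed, after a short loop, by its own reverse complement. A minimal sketch of how such a repeat could be detected is shown below. The function names, the stem and loop lengths, and the 20-base example transcript are all assumptions for illustration, and only standard Watson-Crick pairs are considered (G-U wobble pairs and folding energies are ignored).

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna: str, stem_len: int = 5, min_loop: int = 3, max_loop: int = 8):
    """Return (start, loop_length, left_stem, right_stem) for each inverted repeat."""
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem_len - min_loop + 1):
        left = rna[i:i + stem_len]
        target = reverse_complement(left)  # sequence that would pair with the left stem
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop        # start of the candidate right stem
            if j + stem_len > len(rna):
                break
            if rna[j:j + stem_len] == target:
                hits.append((i, loop, left, rna[j:j + stem_len]))
    return hits


# Hypothetical transcript: stem GACUC, loop GAAA, stem GAGUC, then a run of U.
transcript = "UUGACUCGAAAGAGUCUUUU"
print(find_hairpins(transcript))
# [(2, 4, 'GACUC', 'GAGUC')]
```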

In eukaryotes, termination of transcription occurs by different processes, depending upon the exact polymerase utilized. For pol I genes, transcription is stopped using a termination factor, through a mechanism similar to rho-dependent termination in bacteria. Transcription of pol III genes ends after transcribing a termination sequence that includes a polyuracil stretch, by a mechanism resembling rho-independent prokaryotic termination. Termination of pol II transcripts, however, is more complex.

Transcription of pol II genes can continue for hundreds or even thousands of nucleotides beyond the end of a coding sequence. The RNA strand is then cleaved by a complex that appears to associate with the polymerase. Cleavage seems to be coupled with termination of transcription and occurs at a consensus sequence. Mature pol II mRNAs are polyadenylated at the 3′-end, resulting in a poly(A) tail; this process follows cleavage and is also coordinated with termination.

Both polyadenylation and termination make use of the same consensus sequence, and the interdependence of the processes was demonstrated in the late 1980s by work from several groups. One group of scientists working with mouse globin genes showed that introducing mutations into the consensus sequence AATAAA, known to be necessary for poly(A) addition, inhibited both polyadenylation and transcription termination. They measured the extent of termination by hybridizing transcripts from the different poly(A) consensus sequence mutants with wild-type transcripts, and they saw a decrease in the hybridization signal, suggesting that proper termination was inhibited. They therefore concluded that polyadenylation was necessary for termination (Logan et al., 1987). Another group obtained similar results using a monkey viral system, SV40 (simian virus 40). They introduced mutations into a poly(A) site, which caused mRNAs to accumulate to levels far above wild type (Connelly & Manley, 1988).
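The logic of these mutation experiments can be illustrated with a small sequence search: a single base change within AATAAA is enough to remove the signal from the sequence. The sketch below is only an illustration of that idea. The function name and both example sequences are hypothetical and are not taken from the cited studies.

```python
POLYA_SIGNAL = "AATAAA"  # consensus sequence required for poly(A) addition


def polya_signal_positions(sequence: str, signal: str = POLYA_SIGNAL):
    """Return the start index of every exact occurrence of the signal."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(signal)
    while start != -1:
        positions.append(start)
        start = sequence.find(signal, start + 1)  # allow overlapping matches
    return positions


wild_type = "GCTGCAATAAAGTCTGAGTGGGC"  # hypothetical 3' region containing AATAAA
mutant = "GCTGCAATACAGTCTGAGTGGGC"     # same region with one base changed (A -> C)

print(polya_signal_positions(wild_type))  # [5]
print(polya_signal_positions(mutant))     # []
```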

The exact relationship between cleavage and termination remains to be determined. One model supposes that cleavage itself triggers termination; another proposes that polymerase activity is affected when passing through the consensus sequence at the cleavage site, perhaps through changes in associated transcriptional activation factors. Thus, research in the area of prokaryotic and eukaryotic transcription is still focused on unraveling the molecular details of this complex process, data that will allow us to better understand how genes are transcribed and silenced.

References and Recommended Reading


Connelly, S., & Manley, J. L. A functional mRNA polyadenylation signal is required for transcription termination by RNA polymerase II. Genes and Development 2, 440–452 (1988)

Dennis, P. P., & Bremer, H. Differential rate of ribosomal protein synthesis in Escherichia coli B/r. Journal of Molecular Biology 84, 407–422 (1974)

Dragon, F., et al. A large nucleolar U3 ribonucleoprotein required for 18S ribosomal RNA biogenesis. Nature 417, 967–970 (2002) doi:10.1038/nature00769

Izban, M. G., & Luse, D. S. Factor-stimulated RNA polymerase II transcribes at physiological elongation rates on naked DNA but very poorly on chromatin templates. Journal of Biological Chemistry 267, 13647–13655 (1992)

Kritikou, E. Transcription elongation and termination: It ain't over until the polymerase falls off. Nature Milestones in Gene Expression 8 (2005)

Lee, J. Y., Park, J. Y., & Tian, B. Identification of mRNA polyadenylation sites in genomes using cDNA sequences, expressed sequence tags, and trace. Methods in Molecular Biology 419, 23–37 (2008)

Logan, J., et al. A poly(A) addition site and a downstream termination region are required for efficient cessation of transcription by RNA polymerase II in the mouse beta maj-globin gene. Proceedings of the National Academy of Sciences 84, 8306–8310 (1987)

Nabavi, S., & Nazar, R. N. Nonpolyadenylated RNA polymerase II termination is induced by transcript cleavage. Journal of Biological Chemistry 283, 13601–13610 (2008)

