The twenty-first amino acid

In the early days of molecular biology, it was thought that only 20 amino acids are specified by the genetic code. Four ‘letters’, or nucleotides, are used in this code, which is read by grouping the letters into triplets. When the genetic code was cracked in the 1960s, each of the 64 possible combinations of three letters was thought to have just one use, either encoding one of the 20 amino acids or marking the start or end point of the read-out. But in 1986 it was discovered that under special circumstances the meaning of one of the ‘stop codons’ — the letters UGA — is changed to specify the twenty-first amino acid, selenocysteine. Only a few proteins have selenocysteine in their sequences, but this unusual amino acid often has a very specific chemical role. The mechanisms for encoding selenocysteine are proving very interesting. Tujebajeva and co-workers1 and Fagegaltier and colleagues2, writing in EMBO Reports and EMBO Journal, respectively, provide clues to how mammalian cells read ‘selenocysteine’ instead of ‘stop’.
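The arithmetic behind the "64 possible combinations" can be checked directly. The following is a minimal sketch, assuming the standard genetic code written in the conventional compact form (one letter per codon, bases ordered U, C, A, G, with `*` marking a stop); the names `BASES`, `AMINO_ACIDS` and `CODON_TABLE` are illustrative and do not come from the article.

```python
from itertools import product

BASES = "UCAG"  # the four RNA 'letters'
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every possible triplet with its standard meaning.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
standard_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE))           # 64 triplets (4 ** 3)
print(stop_codons)                # ['UAA', 'UAG', 'UGA']
print(len(standard_amino_acids))  # 20
```

In this table UGA has only its standard meaning, 'stop'; selenocysteine is absent because its insertion depends on context rather than on the triplet alone.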

First, let us remind ourselves of the background. To translate a gene into a protein, a cell first makes a working copy of the gene, a messenger RNA template. RNA–protein complexes called ribosomes then read along the template, deciphering what each nucleotide triplet means. They start at the triplet AUG, adding amino acids sequentially (brought in by transfer RNAs) as specified by the following triplets in the sequence, and stop when they reach a stop codon. Unless, that is, the stop codon is UGA and the mRNA template contains instructions that a selenocysteine residue should be inserted at that point.
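The read-out described here can be mimicked in a few lines. This is a toy sketch, not a model of a real ribosome: the function `translate`, its `recode_uga` flag and the short example message are all hypothetical, and the flag simply stands in for whatever instruction in the mRNA tells the cell to read UGA as selenocysteine (one-letter code `U`). For simplicity it applies to every in-frame UGA.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str, recode_uga: bool = False) -> str:
    """Read triplets from the first AUG until a stop codon is reached.

    If recode_uga is True (an assumed stand-in for the recoding signal),
    UGA is read as selenocysteine ('U') instead of 'stop'.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and recode_uga:
            protein.append("U")  # selenocysteine inserted, reading continues
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break  # ordinary stop codon: release the protein
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: AUG GCU UGA AAA UAA, flanked by two spare bases.
mrna = "GGAUGGCUUGAAAAUAAGG"
print(translate(mrna))                   # MA
print(translate(mrna, recode_uga=True))  # MAUK
```

The same message yields a truncated product when UGA means 'stop' and a longer, selenocysteine-containing one when it is redefined.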

The mechanism by which UGA is redefined to specify selenocysteine in the bacterium Escherichia coli was unravelled by Böck and colleagues3. Here, the key instruction is a special folded structure in the mRNA, a ‘stem loop’, immediately following the UGA. The loop of this stem loop binds a complex consisting of a protein, SelB, and the special transfer RNA that carries selenocysteine. SelB is an ‘elongation factor’, like the better-known EF-Tu, which brings to ribosomes the tRNAs that carry the standard amino acids. SelB, by contrast, is specific for the selenocysteine-bound tRNA. It also has a feature not seen in EF-Tu: a carboxy-terminal extension that is responsible for binding to the loop sequence of the stem loop, ensuring that only a UGA codon followed by the right stem-loop structure will have its meaning changed. So, the proximity of the UGA codon and the stem loop leads to a model in which the selenocysteine-bound tRNA is delivered, by SelB, directly to the UGA codon.

But in mammals the signal that redefines UGA is a more complicated stem-loop structure (called SECIS) that is located not next to the UGA codon, but far away, beyond the end of the coding sequence of the mRNA4 (Fig. 1). Attempts to understand how this system works focused first on trying to find a protein that binds to the SECIS region, on the assumption that such a protein might work in a similar way to the bacterial SelB protein. Yet despite numerous false alarms, the mammalian counterpart(s) of the bacterial SelB remained elusive. These ‘scarlet pimpernel’ properties can be explained by the discovery that, in mammals, the functions of SelB are divided between two proteins.
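The contrast between the bacterial and mammalian signals comes down to where the stem loop sits relative to the UGA. The sketch below encodes only that positional rule, under loud assumptions: the `Message` class, the `uga_meaning` function, the `max_gap` parameter and all coordinates are hypothetical, and the real systems also depend on the sequence and shape of the stem loop and on the proteins that bind it.

```python
from dataclasses import dataclass
from typing import Optional

CODON_LENGTH = 3


@dataclass
class Message:
    """Toy description of an mRNA, using nucleotide indices (assumed layout)."""
    uga_start: int                  # index of the U of the in-frame UGA
    coding_end: int                 # index just past the coding sequence
    stem_loop_start: Optional[int]  # where the recoding stem loop begins, if any


def uga_meaning(msg: Message, system: str, max_gap: int = 0) -> str:
    """Return 'selenocysteine' or 'stop' from the position of the stem loop.

    bacterial: the stem loop must immediately follow the UGA
               (max_gap is an illustrative tolerance, not a measured value).
    mammalian: the stem loop (SECIS) must lie beyond the coding sequence.
    """
    if msg.stem_loop_start is None:
        return "stop"
    if system == "bacterial":
        gap = msg.stem_loop_start - (msg.uga_start + CODON_LENGTH)
        recoded = 0 <= gap <= max_gap
    elif system == "mammalian":
        recoded = msg.stem_loop_start >= msg.coding_end
    else:
        raise ValueError(f"unknown system: {system!r}")
    return "selenocysteine" if recoded else "stop"


adjacent = Message(uga_start=300, coding_end=900, stem_loop_start=303)
distant = Message(uga_start=300, coding_end=900, stem_loop_start=1500)

print(uga_meaning(adjacent, "bacterial"))  # selenocysteine
print(uga_meaning(distant, "bacterial"))   # stop
print(uga_meaning(distant, "mammalian"))   # selenocysteine
```

The bacterial rule is local, so delivery of the tRNA next to the UGA is easy to picture; the mammalian rule is not, which is what makes the delivery mechanism puzzling.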

Figure 1: Redefining the stop codon in mammalian messenger RNAs.

Ribosomes move along mRNA, deciphering the nucleotide sequence and making a protein according to the encoded amino-acid sequence. The nucleotide sequence UGA normally specifies that the ribosome should stop translation. But sometimes this stop codon can be redefined, so that the twenty-first amino acid — selenocysteine — is incorporated instead. This model shows how this might be done11. a, A ‘stem loop’ structure in the downstream, untranslated part of the mRNA binds to a protein called SBP2. SBP2 in turn binds to the eEFsec protein, which itself has recruited the transfer RNA carrying selenocysteine. b, The selenocysteine-bound tRNA is then delivered to the waiting UGA, for incorporation into the growing amino-acid string that constitutes the newly created protein.

One protein, called SBP2 (SECIS-binding protein 2), binds the SECIS element5. The binding specificity5 of this protein depends on a key feature of the SECIS: a quartet of non-Watson–Crick base pairs6. But this protein does not have the task of binding to the selenocysteine-carrying tRNA, and it does not have the sequence features that would be expected of a protein that brings tRNAs to the ribosome. The discovery of the mammalian protein that does this, called eEFsec (refs 1, 2), was helped by the finding7 of a specialized elongation factor in Methanococcus jannaschii, a microorganism from the Archaea. This elongation factor from M. jannaschii does not bind to the SECIS element, but does bind to the selenocysteine tRNA.

Tujebajeva et al.1 and Fagegaltier et al.2 have now discovered the mammalian counterpart of this archaeal protein, by searching through sequence databases using the amino-acid sequence of the archaeal protein as a starting point. Like its archaeal counterpart, the mammalian protein does not bind the SECIS element but does interact directly both with tRNAs bearing selenocysteine1,2 and with SBP2 (ref. 1). So the SECIS element, through a two-protein complex containing SBP2 and eEFsec, can recruit selenocysteine-carrying tRNAs (Fig. 1).

But this protein complex, when bound to the stem-loop structure, is a long way from the UGA codon. How does the distant complex find the waiting ribosome and deliver selenocysteine? This is especially perplexing in the case of a protein called SelP, whose mRNA has between 10 and 17 UGA codons depending on the species8,9, each coding for selenocysteine. For such cases, one could imagine a processive model based on the known proximity of the two ends of an mRNA strand. In this model the SECIS element would deliver the SBP2 complex to a ribosome that is just starting translation. When the ribosome reaches the first UGA codon, the required selenocysteine is already with the ribosome, ready to be inserted into the growing protein. Afterwards, the ribosome-bound complex might be able to pick up a second tRNA-bound selenocysteine, and so on. But, if operative, this or other processivity models must have sophisticated aspects that are not yet apparent10.

The obvious model, by analogy with the situation in E. coli, is that the protein complex, bound to the tRNA and to the SECIS element, reaches back and delivers the tRNA (plus selenocysteine) directly to the ribosome11 (Fig. 1). But how the loaded SECIS finds the waiting ribosome remains a mystery. With the new proteins1,2,5 now identified, one hopes that answers to these questions will not be long in coming. Help will no doubt come from studies of the supramacromolecular translation complexes12 that may coordinate interactions central to the decoding of genetic text.


1. Tujebajeva, R. M. et al. EMBO Rep. 1, 158–163 (2000).
2. Fagegaltier, D. et al. EMBO J. 19, 4796–4805 (2000).
3. Böck, A. BioFactors 11, 77–78 (2000).
4. Berry, M. J. et al. Nature 353, 273–276 (1991).
5. Copeland, P. R., Fletcher, J. E., Carlson, B. A., Hatfield, D. L. & Driscoll, D. M. EMBO J. 19, 306–314 (2000).
6. Walczak, R., Carbon, P. & Krol, A. RNA 4, 74–84 (1998).
7. Rother, M., Wilting, S., Commans, S. & Böck, A. J. Mol. Biol. 299, 351–358 (2000).
8. Hill, K. E., Lloyd, S. & Burk, R. F. Proc. Natl Acad. Sci. USA 90, 537–541 (1993).
9. Tujebajeva, R. M., Ransom, D. G., Harney, J. W. & Berry, M. J. Genes Cells (in the press).
10. Nasim, M. T. et al. J. Biol. Chem. 275, 14846–14852 (2000).
11. Low, S. C. & Berry, M. J. Trends Biochem. Sci. 21, 203–208 (1996).
12. Stapulionis, R., Kolli, S. & Deutscher, M. P. J. Biol. Chem. 272, 24980–24986 (1997).


Atkins, J., Gesteland, R. The twenty-first amino acid. Nature 407, 463–464 (2000).
