Protein folding - what gives a protein its shape?
Background

The importance of protein folding

Joachim Pietzsch

Proteins are the biological workhorses that carry out vital functions in every cell. To carry out their task, proteins must fold into a complex three-dimensional structure — but what tells a protein which shape it should be and how does it achieve this?

Of all the molecules found in living organisms, proteins are the most important. They are used to support the skeleton, control senses, move muscles, digest food, defend against infections and process emotions. Proteins come in all shapes and sizes (Fig. 1): they can be round (like haemoglobin), long (like collagen), strong (like spectrin, which protects erythrocytes, the cells that carry oxygen from the lungs to our tissues, from the powerful shearing forces they are exposed to), or elastic (like titin, which controls muscle stretching and contraction). The name protein is derived from the Greek word prôtos, meaning ‘primary’ or ‘first rank of importance’, and with good reason. Proteins are the most abundant component of a cell (more than half the dry weight of a cell is made up of proteins) and they have a range of indispensable roles. Enzymes, for example, are the biocatalysts that carry out crucial biochemical reactions in every cell, reactions that would otherwise be too slow to sustain life.

What is remarkable is that the more than 100,000 proteins in our bodies are produced from a set of only 20 building blocks, known as amino acids. All amino acids have the same basic structure (an amino group, a carboxyl group and a hydrogen atom) but differ in their side-chain (known as R; Fig. 2). This side-chain varies dramatically between amino acids, from a simple hydrogen atom in the amino acid glycine to a complex ring structure in tryptophan. Depending on the nature of the side-chain, an amino acid can be hydrophilic (water-attracting) or hydrophobic (water-repelling), acidic or basic; and it is this diversity in side-chain properties that gives each protein its specific character.

Creating a functional protein
The sequence of amino acids in a protein defines its primary structure. The blueprint for each amino acid is laid down by sets of three letters — known as base triplets — that are found in the coding regions of genes. These base triplets are recognized by ribosomes, the protein building sites of the cell, which create and successively join the amino acids together. This is a remarkably quick process: a protein of 300 amino acids will be made in little more than a minute.
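The step from base triplets to a chain of amino acids can be sketched in a few lines of Python. This is only an illustration, not a model of the ribosome: the table below is a small subset of the standard genetic code (a full table has 64 entries), and the function name `translate` and the example sequence are hypothetical, chosen for this sketch.

```python
# Small subset of the standard genetic code (mRNA codons).
# A complete table has 64 entries; this reduced one is for illustration only.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "UGG": "Trp",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,  # stop codons
}


def translate(mrna):
    """Read an mRNA string three letters at a time and return the amino-acid chain."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this reduced table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # a stop codon ends the chain
            break
        chain.append(amino_acid)
    return chain


# Hypothetical example sequence: four coding triplets followed by a stop codon.
primary_structure = translate("AUGUUUGGCUGGUAA")
print("-".join(primary_structure))
```

The returned list is the primary structure in the sense used above: nothing more than the order of the amino acids. At the pace quoted in the text (300 amino acids in a little more than a minute), a ribosome would add roughly five amino acids per second.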

The result is a linear chain of amino acids, but this only becomes a functional protein when it folds into its three-dimensional form (tertiary structure). This occurs through an intermediate form, known as secondary structure, the most common of which are the rod-like α-helix and the plate-like β-pleated sheet (Fig. 3). These secondary structures are formed by a small number of amino acids that are close together, which then, in turn, interact, fold and coil to produce the tertiary structure that contains the protein’s functional regions (called domains).

Although it is possible to deduce the primary structure of a protein from a gene’s sequence, its tertiary structure cannot be deduced in the same way (although it should become possible to make predictions when more tertiary structures are submitted to databases). It can only be determined by complex experimental analyses and, at present, this information is only known for about 10% of proteins. It is therefore not yet known how an amino-acid chain folds into its tertiary structure on the short time scale (fractions of a second) observed in the cell. So, there is a huge gap in our knowledge of how we move from protein sequence to function in living organisms: the line of sight from the genetic blueprint for a protein to its biological function is blocked by the impenetrable jungle of protein folding, and some researchers believe that clearing this jungle is the most important task in biochemistry at present.

The quest to understand protein folding
One of the most important results in understanding the process of protein folding came from a thought-provoking experiment that was carried out by Christian Anfinsen and colleagues in the early 1960s. They investigated a protein called ribonuclease, which they isolated from the pancreatic tissue of cattle. This enzyme, made up of 124 amino acids, cleaves any ribonucleic acid (RNA) that could be harmful to the cell, such as truncated RNA that would not make a fully operational protein. To do this (although this was not known in Anfinsen’s time) it briefly binds RNA in a binding site and relies on several residues of the sulphur-containing amino acid cysteine, which form bonds with each other (called disulphide bridges) and hold the protein structure together.

Ribonuclease can be denatured by adding certain chemicals or by heat. The disulphide bridges break and other forces of attraction between amino acids disappear, which makes the enzyme collapse into a tangled, useless ball. In various studies, Anfinsen showed that this denaturation process could be completely reversed by removing the denaturing chemicals or by lowering the temperature. The ribonuclease then folds back to its natural functional state on its own. So, Anfinsen concluded that the amino-acid sequence determines the shape of a protein, a finding for which he received the Nobel Prize in Chemistry in 1972.

But, if this is true, how does a protein find the right conformation among the practically endless number of three-dimensional forms that it could randomly fold into? After all, the folding of a protein is not a chemical reaction, with a bond breaking here and a new one forming there. It is more like the weaving of an intertwined molecular pattern, the stability of which is defined by innumerable forces between atoms. Indeed, Cyrus Levinthal calculated in 1969 that finding the strongest attraction by simple trial and error would be impossible. He said that even if a protein only consisted of 100 amino acids and each of these flexible residues could only take on two different spatial orientations, the protein could theoretically adopt as many as 10^30 possible conformations (2^100 is roughly 10^30). Assuming a protein could try out 100 billion different conformations per second, it would still take on the order of 100 billion years to try all possibilities. So, Levinthal suggested that nature must have devised more effective methods to achieve this, and postulated the existence of defined sets of folding pathways by which protein folding can take place rapidly.
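Levinthal's estimate is easy to reproduce. The short Python sketch below uses only the numbers given in the text (100 residues, two orientations each, 100 billion trials per second); the function name and the length of a year in seconds are choices made for this illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average calendar year, an assumption of this sketch


def levinthal_search_time_years(residues=100, states_per_residue=2, trials_per_second=1e11):
    """Return (number of conformations, years needed to try every one of them)."""
    conformations = states_per_residue ** residues
    seconds = conformations / trials_per_second
    return conformations, seconds / SECONDS_PER_YEAR


conformations, years = levinthal_search_time_years()
print(f"{conformations:.2e} conformations")
print(f"{years:.1e} years to try them all")
```

With these inputs the count comes out at roughly 1.3 × 10^30 conformations and the exhaustive search at a few hundred billion years, the same order of magnitude as the figure quoted above and far longer than the age of the universe.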

However, we now know that such fixed protein folding pathways do not seem to exist. Various protein folding pathways that have been investigated experimentally and theoretically in recent years have thrown up interesting hypotheses, but have remained hard to prove in working models. In the case of proteins of fewer than 100 amino acids, only two states can be observed: the unfolded protein (which occurs in numerous forms) and the finished, folded, functional protein. For larger proteins, three states can be observed. The intermediate is either a so-called ‘molten globule’, which is formed by a process called hydrophobic collapse (in which all hydrophobic side-chains suddenly slide inside the protein or clump together), or a structure in which the secondary structures of the protein are already fully formed. However, there is disagreement about whether these intermediates are formed en route to the correct folding pattern, or whether they represent structural cul-de-sacs.

Finding the energy to fold
As with all processes in nature, protein folding also needs energy — the process has to obey the laws of thermodynamics. A protein always folds so that it achieves the lowest possible energy — just as we always try to adopt the most comfortable position, in which we need to move about least, when going to sleep. It is thought that this is achieved by using an energy gradient or ‘funnel’ along the path from the random tangle to the folded protein. Alan Fersht of Cambridge University used the following analogy to illustrate this model: if you blindfold a golfer and let him hit the ball in any direction he likes, the probability that he will hole the ball is almost infinitesimal. The same is true of a protein finding the right form by chance. However, if all parts of the golf course slope toward the hole, which is at the lowest point in the area, even a blindfolded golfer has a good chance of finding the hole. So, fixed reaction pathways are not necessary, as each protein seeks out its natural shape through a funnel of declining energy; it can take many folding routes and still reach its target of the completed tertiary structure.
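The golf-course analogy can be turned into a toy simulation. The Python sketch below is not a physical model of folding: it assumes a hypothetical chain in which every residue has two orientations, the native state is the one with all residues set to 0, and the "energy" is simply the number of residues that are still in the wrong orientation. All function names are invented for this illustration.

```python
import random


def energy(conformation):
    """Toy energy: the number of residues not in their native (0) orientation."""
    return sum(conformation)


def random_search(n_residues, rng, max_tries):
    """Flat golf course: guess whole conformations at random until the native one turns up."""
    for tries in range(1, max_tries + 1):
        guess = [rng.randint(0, 1) for _ in range(n_residues)]
        if energy(guess) == 0:
            return tries
    return None  # native state not found within max_tries


def funnel_search(n_residues, rng, max_steps):
    """Sloping course: flip one random residue at a time, keep moves that do not raise the energy."""
    conformation = [1] * n_residues  # start with every residue in the wrong orientation
    for steps in range(1, max_steps + 1):
        i = rng.randrange(n_residues)
        trial = conformation.copy()
        trial[i] ^= 1  # flip the orientation of residue i
        if energy(trial) <= energy(conformation):
            conformation = trial
        if energy(conformation) == 0:
            return steps
    return None


rng = random.Random(0)  # fixed seed so that a run can be repeated
n = 12                  # 2**12 = 4096 possible conformations
print("random search tries:", random_search(n, rng, max_tries=50_000))
print("funnel search steps:", funnel_search(n, rng, max_steps=50_000))
```

In the random search the expected number of tries doubles with every added residue, whereas the downhill search grows only slightly faster than the chain length. The order in which residues settle also differs from run to run, which mirrors the point that many folding routes can lead to the same final structure.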

A helping hand
This understanding of protein folding was obtained from computer models (in silico) or from experiments in the laboratory (in vitro) in which an individual protein was denatured to observe it folding back into its original form. But the situation is considerably more complex in the living cell (in vivo). Although the fundamental energy rules also apply here, folding, at least of large proteins, rarely takes place spontaneously, as the ribosomes do not synthesize only one protein at a time. Instead, cells contain a vast number of proteins and other biomolecules at the extraordinarily high concentration of 340 grams per litre. Ordered protein folding in this cramped chaos is only possible under the supervision of specialized molecules, called chaperones, which accompany proteins and make sure that those that are being formed at the ribosomes do not clump together prematurely (Fig. 4). Chaperones do not merely oversee the folding of the protein; they also protect its tertiary structure when the cell is under stress, for example at elevated body temperature. For this reason, chaperones have also been classified as heat-shock proteins (HSPs).

The HSP70s, so called because they have a molecular weight of 70 kilodaltons, are the most important class of chaperones. They bind to the developing protein chain and protect those parts of the newly formed protein that are particularly sensitive to premature reaction with the environment and therefore to malformation. When they let go of the new protein chain, it is ready to fold. It is now taken over by a chaperonin, a molecule shaped like a double ring, which fits round the protein chain like a cylinder so that it can fold undisturbed inside (Fig. 4). Although the cylindrical folding cage opens roughly every 10 seconds, the protein only leaves the chaperonin when it has achieved its required form.

We now understand better than ever how protein folding takes place, both in vitro and in vivo. And this, in turn, has given us a better understanding of the origin and course of diseases that are associated with defective protein folding. But why the normal folding of every protein always runs towards a predetermined goal and what this goal looks like (that is, what the instructions in the primary structure are that determine the correct tertiary structure) is still a great mystery, even though the number of proposed models has dramatically increased. Computer simulations cannot yet solve the folding code that is hidden in the primary structure by simply calculating the molecular dynamics atom by atom, as to work through just 50 milliseconds of folding would take even the fastest computer around 30,000 years. Any realistic hope of cracking the folding code, for instance in order to produce special designer proteins that evolution had not planned, is probably a very long way off. However, our improved understanding of the route that a protein must take from its synthesis to the correct folded form already enables us to contemplate better treatments or even cures for diseases in which proteins have departed from the correct folding route (see Treating protein folding diseases).
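To get a feel for the simulation figure quoted above, the arithmetic can be sketched as follows. The 50 milliseconds and 30,000 years come from the text; the integration time step of one femtosecond is an assumption of this sketch (a typical order of magnitude for atom-by-atom molecular dynamics), and the variable names are invented for the illustration.

```python
FOLDING_TIME_S = 50e-3   # 50 milliseconds of folding, from the text
COMPUTE_YEARS = 30_000   # computing time quoted in the text
TIME_STEP_S = 1e-15      # ASSUMPTION: one femtosecond per simulation step
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Number of tiny time steps needed to cover the whole folding process.
steps_needed = FOLDING_TIME_S / TIME_STEP_S

# Simulation speed implied by the quoted computing time.
steps_per_second = steps_needed / (COMPUTE_YEARS * SECONDS_PER_YEAR)

print(f"{steps_needed:.1e} time steps")
print(f"{steps_per_second:.0f} time steps per second of computing")
```

Under this assumption the simulation has to advance through about 5 × 10^13 steps, and the quoted 30,000 years corresponds to only a few dozen steps per second, each of which involves recalculating the forces on every atom of the protein and its surroundings.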

 
 


 
 
 
   