NOTE: In the version of this Review initially published, an author (B. Martin Hallberg) was omitted from the author list. This information has been added to the HTML and PDF versions of the Review.
In selecting a method to produce a recombinant protein, a researcher is faced with a bewildering array of choices as to where to start. To facilitate decision-making, we describe a consensus 'what to try first' strategy based on our collective analysis of the expression and purification of over 10,000 different proteins. This review presents methods that could be applied at the outset of any project, a prioritized list of alternate strategies and a list of pitfalls that trip many new investigators.
The Structural Genomics Consortium is a registered charity (number 1097737) that receives funds from the Canadian Institutes for Health Research, the Canadian Foundation for Innovation, Genome Canada through the Ontario Genomics Institute, GlaxoSmithKline, Karolinska Institutet, the Knut and Alice Wallenberg Foundation, the Ontario Innovation Trust, the Ontario Ministry for Research and Innovation, Merck & Co., Inc., the Novartis Research Foundation, the Swedish Agency for Innovation Systems, the Swedish Foundation for Strategic Research and the Wellcome Trust. The New York Structural GenomiX Research Center for Structural Genomics is supported by the US National Institute of General Medical Sciences (U54 GM074945). Work at the MDC was supported by the German Federal Ministry for Education and Research (BMBF) through the Leitprojektverbund Proteinstrukturfabrik and the German National Genome Network (NGFN; FKZ 01GR0471, 01GR0472), and by the Fonds der Chemischen Industrie. The Protein Sample Production Facility is funded by the Helmholtz Association of German Research Centres. The China Structural Genomics Consortium is supported by the National 863 Hi-Tech Research and Development Program of China. The Israel Structural Proteomics Center is supported by The Israel Ministry of Science, Culture and Sport, the Divadol Foundation, the Neuman Foundation and the European Commission Sixth Framework Research and Technological Development Programme 'SPINE2-Complexes' Project under contract 031220. The RIKEN Structural Genomics/Proteomics Initiative was supported by the National Project on Protein Structural and Functional Analyses, Ministry of Education, Culture, Sports, Science and Technology of Japan. The Joint Center for Structural Genomics is supported by the US National Institutes of Health (NIH) Protein Structure Initiative grant U54 GM074898 from the NIGMS. The Northeast Structural Genomics Consortium is supported by the NIH NIGMS (U54-GM074958).
The Midwest Center for Structural Genomics is supported by the NIH (GM074942) and by the US Department of Energy, Office of Biological and Environmental Research (DE-AC02-06CH11357). The Oxford Protein Production Facility is funded by the UK Medical Research Council and Biotechnology and Biological Sciences Research Council. SPINE2-Complexes is funded by the European Commission (contract 031220) under the Framework 6 RTD Programme and is coordinated from the Division of Structural Biology, Wellcome Trust Centre for Human Genetics, Oxford, UK. The Berkeley Structural Genomics Center is supported by the NIH (GM62412). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIGMS or the NIH.
About this article
Cite this article
Structural Genomics Consortium., Architecture et Fonction des Macromolécules Biologiques., Berkeley Structural Genomics Center. et al. Protein production and purification. Nat Methods 5, 135–146 (2008). https://doi.org/10.1038/nmeth.f.202