Abstract
We present a large-scale approach to investigate the functional consequences of sequence variation in a protein. The approach entails the display of hundreds of thousands of protein variants, moderate selection for activity and high-throughput DNA sequencing to quantify the performance of each variant. Using this strategy, we tracked the performance of >600,000 variants of a human WW domain after three and six rounds of selection by phage display for binding to its peptide ligand. Binding properties of these variants defined a high-resolution map of mutational preference across the WW domain; each position had unique features that could not be captured by a few representative mutations. Our approach could be applied to many in vitro or in vivo protein assays, providing a general means for understanding how protein function relates to sequence.
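The quantification step described above, comparing how often each variant is sequenced before and after selection, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `log2_enrichment`, the pseudocount, and the toy read counts are all assumptions introduced here.

```python
import math


def log2_enrichment(input_counts, selected_counts, pseudocount=0.5):
    """Score each variant by log2(selected frequency / input frequency).

    input_counts / selected_counts map a variant label to its read count in
    the input library and in a selected round. The pseudocount (an assumption
    of this sketch) keeps variants that vanish after selection from producing
    log2(0); with pseudocount=0 such variants raise ValueError.
    """
    in_total = sum(input_counts.values())
    sel_total = sum(selected_counts.values())
    if in_total == 0 or sel_total == 0:
        raise ValueError("both libraries must contain at least one read")

    scores = {}
    for variant, n_in in input_counts.items():
        if n_in == 0:
            continue  # no input reads: a ratio is undefined for this variant
        n_sel = selected_counts.get(variant, 0)
        f_in = (n_in + pseudocount) / in_total
        f_sel = (n_sel + pseudocount) / sel_total
        scores[variant] = math.log2(f_sel / f_in)
    return scores


# Hypothetical read counts for three variants (illustrative only).
input_reads = {"wild-type": 1000, "variant_1": 400, "variant_2": 600}
round6_reads = {"wild-type": 1500, "variant_1": 100, "variant_2": 400}

for name, score in sorted(log2_enrichment(input_reads, round6_reads).items()):
    print(f"{name}: {score:+.2f}")
```

A positive score means the variant became more frequent under selection and a negative score means it was depleted. Scores from this sketch are relative to the library as a whole; normalizing to the wild-type score is one common way to express them relative to the parent sequence.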
Accession codes
Acknowledgements
We thank C. Lee and J. Shendure for assistance with DNA sequencing, and J. Kelly, J. Thomas, J. Hesselberth, E. Phizicky, A. Rubin, L. Starita and K. McGarvey for helpful comments and discussion. This work was supported by the US National Institutes of Health (P41 RR11823 to S.F. and D.B., and F32GM084699 to D.M.F.). S.F. and D.B. were supported by the Howard Hughes Medical Institute.
Author information
Contributions
D.M.F. conceived of the method, carried out the experiments, analyzed the data and wrote the paper; C.L.A. conceived of the method, analyzed the data and wrote the paper; J.J.S. carried out the experiments; E.H.K., S.J.F. and D.B. carried out the protein folding and binding energy calculations; and S.F. conceived of the method and wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–11, Supplementary Tables 1–2, Supplementary Note 1 (PDF 13072 kb)
About this article
Cite this article
Fowler, D., Araya, C., Fleishman, S. et al. High-resolution mapping of protein sequence-function relationships. Nat Methods 7, 741–746 (2010). https://doi.org/10.1038/nmeth.1492
DOI: https://doi.org/10.1038/nmeth.1492