Using BEAN-counter to quantify genetic interactions from multiplexed barcode sequencing experiments

Abstract

The construction of genome-wide mutant collections has enabled high-throughput, high-dimensional quantitative characterization of gene and chemical function, particularly via genetic and chemical–genetic interaction experiments. As the throughput of such experiments increases with improvements in sequencing technology and sample multiplexing, appropriate tools must be developed to handle the large volume of data produced. Here, we describe how to apply our approach to high-throughput, fitness-based profiling of pooled mutant yeast collections using the BEAN-counter software pipeline (Barcoded Experiment Analysis for Next-generation sequencing) for analysis. The software has also successfully processed data from Schizosaccharomyces pombe, Escherichia coli, and Zymomonas mobilis mutant collections. We provide general recommendations for the design of large-scale, multiplexed barcode sequencing experiments. The procedure outlined here was used to score interactions for ~4 million chemical-by-mutant combinations in our recently published chemical–genetic interaction screen of nearly 14,000 chemical compounds across seven diverse compound collections. Here we selected a representative subset of these data on which to demonstrate our analysis pipeline. BEAN-counter is open source, written in Python, and freely available for academic use. Users should be proficient at the command line; advanced users who wish to analyze larger datasets with hundreds or more conditions should also be familiar with concepts in analysis of high-throughput biological data. BEAN-counter encapsulates the knowledge we have accumulated from, and successfully applied to, our multiplexed, pooled barcode sequencing experiments. This protocol will be useful to those interested in generating their own high-dimensional, quantitative characterizations of gene or chemical function in a high-throughput manner.

Fig. 1: Overview of multiplexed barcode sequencing experiments and their processing using the BEAN-counter software.
Fig. 2: Design of large-scale, pooled and multiplexed chemical–genetic interaction screens.
Fig. 3: Schematic of the steps involved in processing large-scale interaction screens using BEAN-counter.
Fig. 4: The large signature that we observe in and remove from most of our datasets.
Fig. 5: An inoculation date-related effect in one of our datasets.
Fig. 6: Typical barcode and index tag abundance distributions.
Fig. 7: Manual examination of the dataset to flag mutants and conditions for removal.
Fig. 8: Analysis of same-compound, same-index-tag, and same-lane correlations to detect the presence of batch effects and uninformative signal.
Fig. 9: Removal of large, uninformative signature via singular-value decomposition (SVD).
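
The caption of Fig. 9 refers to removing a large, uninformative signature via SVD. The following is a minimal sketch of that general idea, not BEAN-counter's own implementation: it assumes a mutants-by-conditions score matrix and subtracts the leading singular component(s). The function name `remove_top_components`, the matrix dimensions, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def remove_top_components(scores, n_components=1):
    """Subtract the leading `n_components` singular components from `scores`.

    Returns the cleaned matrix and the removed low-rank signature.
    (Hypothetical helper; not part of the BEAN-counter API.)
    """
    u, s, vt = np.linalg.svd(scores, full_matrices=False)
    # Keep only the leading singular values; zero out the rest.
    s_removed = s.copy()
    s_removed[n_components:] = 0.0
    # Rebuild the low-rank signature from the retained components.
    signature = (u * s_removed) @ vt
    return scores - signature, signature


# Simulated data (assumption): one strong signature shared by all
# conditions, plus small condition-specific noise.
rng = np.random.default_rng(0)
mutant_effect = rng.normal(size=(50, 1))      # 50 hypothetical mutants
condition_weight = rng.normal(size=(1, 20))   # 20 hypothetical conditions
shared = 5.0 * mutant_effect @ condition_weight
noise = rng.normal(scale=0.1, size=(50, 20))
scores = shared + noise

cleaned, signature = remove_top_components(scores, n_components=1)
print("norm before:", np.linalg.norm(scores))
print("norm after: ", np.linalg.norm(cleaned))
```

How many components to remove is a judgment call that depends on the dataset; in this sketch it is simply a parameter.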

Code availability

The source code for BEAN-counter is available from https://github.com/csbio/BEAN-counter. It requires a license for use (the license can be obtained at http://z.umn.edu/beanctr). It is free for academic use and can be purchased on a per-project basis for commercial use.

Data availability

All data needed to process the example dataset into chemical–genetic interaction scores are available at http://csbio.cs.umn.edu/BEAN-counter/example_dataset/. These data are a subset of the complete large-scale chemical–genetic interaction dataset, which is available from http://mosaic.cs.umn.edu, the supplementary material of the associated article (ref. 10), or the corresponding author upon reasonable request.

References

  1. Giaever, G. et al. Genomic profiling of drug sensitivities via induced haploinsufficiency. Nat. Genet. 21, 278–283 (1999).
  2. Parsons, A. B. et al. Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways. Nat. Biotechnol. 22, 62–69 (2004).
  3. Parsons, A. B. et al. Exploring the mode-of-action of bioactive compounds by chemical-genetic profiling in yeast. Cell 126, 611–625 (2006).
  4. Pierce, S. E., Davis, R. W., Nislow, C. & Giaever, G. Genome-wide analysis of barcoded Saccharomyces cerevisiae gene-deletion mutants in pooled cultures. Nat. Protoc. 2, 2958–2974 (2007).
  5. Costanzo, M. et al. The genetic landscape of a cell. Science 327, 425–431 (2010).
  6. Costanzo, M. et al. A global genetic interaction network maps a wiring diagram of cellular function. Science 353, aaf1420 (2016).
  7. Hoepfner, D. et al. High-resolution chemical dissection of a model eukaryote reveals targets, pathways and gene functions. Microbiol. Res. 169, 107–120 (2014).
  8. Lee, A. Y. et al. Mapping the cellular response to small molecules using chemogenomic fitness signatures. Science 344, 208–211 (2014).
  9. Estoppey, D. et al. Identification of a novel NAMPT inhibitor by CRISPR/Cas9 chemogenomic profiling in mammalian cells. Sci. Rep. 7, 42728 (2017).
  10. Piotrowski, J. S. et al. Functional annotation of chemical libraries across diverse biological processes. Nat. Chem. Biol. 13, 982–993 (2017).
  11. Roguev, A. et al. Conservation and rewiring of functional modules revealed by an epistasis map in fission yeast. Science 322, 405–410 (2008).
  12. Ryan, C. J. et al. Hierarchical modularity and the evolution of genetic interactomes across species. Mol. Cell 46, 691–704 (2012).
  13. Frost, A. et al. Functional repurposing revealed by comparing S. pombe and S. cerevisiae genetic interactions. Cell 149, 1339–1352 (2012).
  14. Vizeacoumar, F. J. et al. A negative genetic interaction map in isogenic cancer cell lines reveals cancer cell vulnerabilities. Mol. Syst. Biol. 9, 696 (2013).
  15. Babu, M. et al. Quantitative genome-wide genetic interaction screens reveal global epistatic relationships of protein complexes in Escherichia coli. PLoS Genet. 10, e1004120 (2014).
  16. Hart, T. et al. High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities. Cell 163, 1515–1526 (2015).
  17. Hillenmeyer, M. E. et al. The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science 320, 362–365 (2008).
  18. Wildenhain, J. et al. Prediction of synergism from chemical-genetic interactions by machine learning. Cell Syst. 1, 383–395 (2015).
  19. Smith, A. M. et al. Quantitative phenotyping via deep barcode sequencing. Genome Res. 19, 1836–1842 (2009).
  20. Smith, A. M. et al. Highly-multiplexed barcode sequencing: an efficient method for parallel analysis of pooled samples. Nucleic Acids Res. 38, e142 (2010).
  21. Cleveland, W. S. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 74, 829–836 (1979).
  22. Cleveland, W. S. LOWESS: a program for smoothing scatterplots by robust locally weighted regression. Am. Stat. 35, 54 (1981).
  23. Yang, Y. H. with contributions from Paquet, A. & Dudoit, S. marray: Exploratory analysis for two-color spotted microarray data. https://rdrr.io/bioc/marray/ (2009).
  24. Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
  25. Piotrowski, J. S. et al. Chemical genomic profiling via barcode sequencing to predict compound mode of action. Methods Mol. Biol. 1263, 299–318 (2015).
  26. Piotrowski, J. S. et al. Plant-derived antifungal agent poacic acid targets β-1,3-glucan. Proc. Natl. Acad. Sci. USA 112, E1490–E1497 (2015).
  27. Baryshnikova, A. et al. Quantitative analysis of fitness and genetic interactions in yeast on a genome scale. Nat. Methods 7, 1017–1024 (2010).
  28. Morales, E. H. et al. Accumulation of heme biosynthetic intermediates contributes to the antibacterial action of the metalloid tellurite. Nat. Commun. 8, 15320 (2017).
  29. Giaever, G. & Nislow, C. The yeast deletion collection: a decade of functional genomics. Genetics 197, 451–465 (2014).
  30. Ho, C. H. et al. A molecular barcoded yeast ORF library enables mode-of-action analysis of bioactive compounds. Nat. Biotechnol. 27, 369–377 (2009).
  31. Ben-Aroya, S. et al. Toward a comprehensive temperature-sensitive mutant repository of the essential genes of Saccharomyces cerevisiae. Mol. Cell 30, 248–258 (2008).
  32. Spirek, M. et al. S. pombe genome deletion project: an update. Cell Cycle 9, 2399–2402 (2010).
  33. Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 2006.0008 (2006).
  34. Andrusiak, K. Adapting S. cerevisiae Chemical Genomics for Identifying the Modes of Action of Natural Compounds. Master’s thesis, University of Toronto (2012).
  35. Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
  36. Bao, E., Jiang, T., Kaloshian, I. & Girke, T. SEED: efficient clustering of next-generation sequences. Bioinformatics 27, 2502–2509 (2011).
  37. Shimizu, K. & Tsuda, K. SlideSort: all pairs similarity search for short reads. Bioinformatics 27, 464–470 (2011).
  38. Mahé, F., Rognes, T., Quince, C., de Vargas, C. & Dunthorn, M. Swarm: robust and fast clustering method for amplicon-based studies. PeerJ 2, e593 (2014).
  39. Zorita, E., Cuscó, P. & Filion, G. J. Starcode: sequence clustering based on all-pairs search. Bioinformatics 31, 1913–1919 (2015).
  40. Callahan, B. J. et al. DADA2: high-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583 (2016).
  41. Vetrovský, T., Baldrian, P., Morais, D. & Berger, B. SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses. Bioinformatics 34, (2018).
  42. Zhao, L., Liu, Z., Levy, S. F. & Wu, S. Bartender: a fast and accurate clustering algorithm to count barcode reads. Bioinformatics 34, 739–747 (2018).
  43. Dai, Z. et al. edgeR: a versatile tool for the analysis of shRNA-seq and CRISPR-Cas9 genetic screens. F1000Res. 3, 95 (2014).
  44. Mun, J., Kim, D.-U., Hoe, K.-L. & Kim, S.-Y. Genome-wide functional analysis using the barcode sequence alignment and statistical analysis (Barcas) tool. BMC Bioinformatics 17, 475 (2016).
  45. Robinson, D. G., Chen, W., Storey, J. D. & Gresham, D. Design and analysis of Bar-seq experiments. G3 (Bethesda) 4, 11–18 (2014).
  46. Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
  47. Leek, J. T. & Storey, J. D. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 3, 1724–1735 (2007).
  48. Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
  49. Levy, S. F. et al. Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature 519, 181–186 (2015).
  50. Simpkins, S. W. et al. Predicting bioprocess targets of chemical compounds through integration of chemical-genetic and genetic interaction networks. Preprint at https://www.biorxiv.org/content/early/2018/05/18/111252 (2018).

Acknowledgements

S.W.S. thanks B. VanderSluis for proofreading the manuscript and testing the software and also A. Becker at the University of Minnesota Genomics Center for discussions regarding amplicon sequencing issues. This work was supported by RIKEN (http://www.riken.jp/en/) Strategic Programs for R&D, the National Institutes of Health (https://www.nih.gov/; R01HG005084, R01GM104975), and the National Science Foundation (https://www.nsf.gov/; DBI 0953881). S.W.S. was supported by an NSF Graduate Research Fellowship (00039202), an NIH Biotechnology training grant (T32GM008347), and a one-year fellowship from the University of Minnesota Bioinformatics and Computational Biology (BICB) Graduate Program (https://r.umn.edu/academics-research/graduate-programs/bicb). S.C.L. and J.S.P. were supported by a RIKEN Foreign Postdoctoral Research Fellowship. S.C.L. was supported by a RIKEN CSRS (http://www.csrs.riken.jp/en/) Research Topics for Cooperative Projects Award (201601100228) and a RIKEN FY2017 Incentive Research Projects Grant. H.N.W. was supported by a one-year BICB fellowship from the University of Minnesota. C.B. was supported by JSPS KAKENHI (https://www.jsps.go.jp/english/e-grants/) grant no. 15H04483. C.L.M. and C.B. are fellows in the Canadian Institute for Advanced Research (CIFAR, https://www.cifar.ca/) Genetic Networks Program. Computing resources and data storage services were partially provided by the Minnesota Supercomputing Institute and the UMN Office of Information Technology, respectively. Software licensing services were provided by the UMN Office for Technology Commercialization. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Contributions

C.B., M.Y., C.L.M., J.S.P., and S.C.L. conceived the project. R.D. and C.L.M. designed and R.D. wrote the original analysis software, which was extended and re-implemented as a full analysis pipeline by S.W.S. with assistance from J.N. and H.N.W. S.C.L., J.S.P., and Y.Y. generated chemical–genetic interaction data for software development and testing. J.N., S.C.L., and J.S.P. performed extensive user testing of the software. H.O. provided the RIKEN Natural Product Depository compounds used in the example dataset. S.W.S. wrote the manuscript with input and editing from all authors.

Corresponding author

Correspondence to Chad L. Myers.

Ethics declarations

Competing interests

The authors declare competing financial interests. A license is required to use the BEAN-counter software (http://z.umn.edu/beanctr). It is free for academic use and must be purchased on a per-project basis for commercial use.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Piotrowski, J. S. et al. Nat. Chem. Biol. 13, 982–993 (2017): https://doi.org/10.1038/nchembio.2436

Morales, E. H. et al. Nat. Commun. 8, 15320 (2017): https://doi.org/10.1038/ncomms15320

Integrated supplementary information

Supplementary Figure 1 The BEAN-counter configuration file provides necessary information for the pipeline.

Schematic showing how the configuration file coordinates the processing of pooled interaction screening data by specifying the location of the raw data, the structure of the PCR amplicons, and the mappings from genetic barcode to mutant and index tag to condition. Columns in bold are required in order to process data with BEAN-counter.
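
To make the role of these mappings concrete, here is a minimal sketch of how an index-tag-to-condition map and a barcode-to-mutant map could turn raw reads into per-condition mutant counts. It is an illustration only and does not reproduce BEAN-counter's configuration format or code: the amplicon layout (index tag, then a common primer, then the genetic barcode), the sequence lengths, the function name `count_barcodes`, and all sequences and labels are assumptions made for this example.

```python
from collections import Counter


def count_barcodes(reads, index_len, primer_len, barcode_len,
                   tag_to_condition, barcode_to_mutant):
    """Count reads per (mutant, condition) pair.

    Assumes each read is laid out as: index tag, common primer, barcode.
    Reads whose tag or barcode is not in the mappings are skipped.
    (Hypothetical helper; not part of the BEAN-counter API.)
    """
    counts = Counter()
    barcode_start = index_len + primer_len
    for read in reads:
        tag = read[:index_len]
        barcode = read[barcode_start:barcode_start + barcode_len]
        condition = tag_to_condition.get(tag)
        mutant = barcode_to_mutant.get(barcode)
        if condition is not None and mutant is not None:
            counts[(mutant, condition)] += 1
    return counts


# Hypothetical mappings and reads, for illustration only.
tag_to_condition = {"ACGT": "DMSO", "TGCA": "compound_A"}
barcode_to_mutant = {"AAAACCCC": "mutant_1", "GGGGTTTT": "mutant_2"}
primer = "GATC"

reads = [
    "ACGT" + primer + "AAAACCCC",  # mutant_1 in DMSO
    "ACGT" + primer + "AAAACCCC",  # mutant_1 in DMSO
    "ACGT" + primer + "GGGGTTTT",  # mutant_2 in DMSO
    "TGCA" + primer + "GGGGTTTT",  # mutant_2 in compound_A
    "CCCC" + primer + "AAAACCCC",  # unknown index tag, skipped
    "TGCA" + primer + "ACACACAC",  # unknown barcode, skipped
]

counts = count_barcodes(reads, index_len=4, primer_len=len(primer),
                        barcode_len=8,
                        tag_to_condition=tag_to_condition,
                        barcode_to_mutant=barcode_to_mutant)
for (mutant, condition), n in sorted(counts.items()):
    print(mutant, condition, n)
```

This sketch uses exact matching for simplicity; it makes no attempt to tolerate sequencing errors in the tag or barcode.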

Cite this article

Simpkins, S.W., Deshpande, R., Nelson, J. et al. Using BEAN-counter to quantify genetic interactions from multiplexed barcode sequencing experiments. Nat Protoc 14, 415–440 (2019). https://doi.org/10.1038/s41596-018-0099-1
