GIGGLE: a search engine for large-scale integrated genome analysis

Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

doi:10.1038/nmeth.4556

Brief Communication
Published: 08 January 2018

Nature Methods volume 15, pages 123–126 (2018)

Abstract

GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.

**Figure 1: Indexing, searching, performance, and score calibration.**

**Figure 2: Visualization of GIGGLE scores from various searches.**

Acknowledgements

We are grateful to the anonymous reviewers for their suggestions and comments. This research was funded by US National Institutes of Health awards to R.M.L. (K99HG009532) and A.R.Q. (R01HG006693, R01GM124355, U24CA209999).

Author information

Authors and Affiliations

Department of Human Genetics, University of Utah, Salt Lake City, Utah, USA
Ryan M Layer, Brent S Pedersen, Tonya DiSera, Gabor T Marth & Aaron R Quinlan
USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, USA
Ryan M Layer, Brent S Pedersen, Tonya DiSera, Gabor T Marth & Aaron R Quinlan
Department of Oncological Sciences, University of Utah, Huntsman Cancer Institute, Salt Lake City, Utah, USA
Jason Gertz
Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, USA
Aaron R Quinlan

Contributions

R.M.L. conceived and designed the study, developed GIGGLE, and wrote the manuscript. B.S.P. developed the GIGGLE score and the Python and Go APIs. T.D. developed the web interface. G.T.M. provided input in the development of the web interface. J.G. conceived and designed the ChIP-seq experiment. A.R.Q. conceived and designed the study and wrote the manuscript.

Corresponding authors

Correspondence to Ryan M Layer or Aaron R Quinlan.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 GIGGLE indexing process.

(a) Three example annotation sets shown graphically (left) and encoded in files (right) by start position, end position, and ID. (b) GIGGLE's bulk indexing process. (c) The GIGGLE interval search process.
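The caption above describes annotation sets stored as (start, end, ID) records and a search that finds intervals overlapping a query. As a rough illustration of that kind of query only, and not of GIGGLE's actual index structure, the sketch below counts overlaps per annotation set using sorted start and end arrays and binary search. The names `build_index`, `count_overlaps`, and the example data are hypothetical, and intervals are assumed to be closed, on a single chromosome.

```python
from bisect import bisect_left, bisect_right


def build_index(annotation_sets):
    """Store sorted start and end coordinates for each annotation set.

    annotation_sets maps a set name to a list of (start, end) tuples.
    This is a toy stand-in for an index, not GIGGLE's on-disk format.
    """
    index = {}
    for name, intervals in annotation_sets.items():
        starts = sorted(start for start, _ in intervals)
        ends = sorted(end for _, end in intervals)
        index[name] = (starts, ends)
    return index


def count_overlaps(index, q_start, q_end):
    """Count intervals in each set that overlap the closed query [q_start, q_end].

    An interval overlaps the query unless it ends before the query starts
    or starts after the query ends, so:
        overlaps = (#starts <= q_end) - (#ends < q_start)
    """
    hits = {}
    for name, (starts, ends) in index.items():
        started_by_query_end = bisect_right(starts, q_end)
        ended_before_query_start = bisect_left(ends, q_start)
        hits[name] = started_by_query_end - ended_before_query_start
    return hits


# Hypothetical example data: three small annotation sets.
annotation_sets = {
    "set_a": [(1, 5), (10, 20), (30, 40)],
    "set_b": [(3, 12), (50, 60)],
    "set_c": [(100, 200)],
}
index = build_index(annotation_sets)
result = count_overlaps(index, 4, 11)
print(result)
```

Each query costs two binary searches per annotation set, which is adequate for a handful of files; searching thousands of files at once is the case a shared index such as the one in the figure is meant to address.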

Supplementary Figure 2 The GIGGLE scores for all pairwise combinations of the ChIP-seq datasets for the MCF-7 cell line.

Group 1 highlights the relationship between CTCF, RAD21, and STAG1. Group 2 highlights ESR1, FOXA1, GATA3, and EP300. Group 3 shows an unexpected relationship between H2AFX and GREB1.

Supplementary Figure 3 A web interface that integrates data from Roadmap and the UCSC genome browser.

(a) Users specify either a single interval or a file to upload as the query, and the server responds with the GIGGLE results from an index, displayed as a heatmap. In this case, the index contains ChromHMM predictions from Roadmap. The color of each cell indicates the GIGGLE score, and users can click on a cell (e.g., Myoblast enhancers, marked in red) for more information. (b) When the user selects a cell, a window opens that lists the intervals in that particular Roadmap cell type/genome state annotation that overlap the query. Each interval is a link that can be followed (e.g., chr1:33642000-33642800, marked in red) for more information. (c) When an interval is selected, that interval becomes a query to a GIGGLE index of the UCSC genome browser tracks. The result gives the set of tracks that contain an interval overlapping the query, and the web interface opens a window with a “smartview” in which only those tracks with overlaps are displayed.

About this article

Cite this article

Layer, R., Pedersen, B., DiSera, T. et al. GIGGLE: a search engine for large-scale integrated genome analysis. Nat Methods 15, 123–126 (2018). https://doi.org/10.1038/nmeth.4556

Received: 05 July 2017
Accepted: 06 December 2017
Published: 08 January 2018
Issue Date: 01 February 2018
DOI: https://doi.org/10.1038/nmeth.4556

Supplementary information

Supplementary Text and Figures

Life Sciences Reporting Summary (PDF 128 kb)

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Software