SV-HotSpot: detection and visualization of hotspots targeted by structural variants associated with gene expression

Eteleeb, Abdallah M.; Quigley, David A.; Zhao, Shuang G.; Pham, Duy; Yang, Rendong; Dehm, Scott M.; Luo, Jingqin; Feng, Felix Y.; Dang, Ha X.; Maher, Christopher A.

doi:10.1038/s41598-020-71168-7

Published: 28 September 2020

Scientific Reports volume 10, Article number: 15890 (2020)

Abstract

Whole genome sequencing (WGS) has enabled the discovery of genomic structural variants (SVs), including those targeting intergenic and intronic non-coding regions that eluded previous exome focused strategies. However, the field currently lacks an automated tool that analyzes SV candidates to identify recurrent SVs and their targeted sites (hotspot regions), visualizes these genomic events within the context of various functional elements, and evaluates their potential effect on gene expression. To address this, we developed SV-HotSpot, an automated tool that integrates SV candidates, copy number alterations, gene expression, and genome annotations (e.g. gene and regulatory elements) to discover, annotate, and visualize recurrent SVs and their targeted hotspot regions that may affect gene expression. We applied SV-HotSpot to WGS and matched transcriptome data from metastatic castration resistant prostate cancer patients and rediscovered recurrent SVs targeting coding and non-coding functional elements known to promote prostate cancer progression and metastasis. SV-HotSpot provides a valuable resource to integrate SVs, gene expression, and genome annotations for discovering biologically relevant SVs altering coding and non-coding genome. SV-HotSpot is available at https://github.com/ChrisMaherLab/SV-HotSpot.

Introduction

Structural variations (SVs) are genomic rearrangements that involve large segments of DNA. They include deletions (loss of a genomic segment), duplications (gain of one or more additional copies of a genomic segment), insertions (addition of a DNA sequence to the genome), inversions (reversal of the orientation of a genomic segment), and translocations (genomic rearrangements involving one or more chromosomes)¹. SVs are known to contribute to phenotypic differences and to various diseases, including cancers^2,3.

WGS has enabled comprehensive identification of various types of SVs targeting both the coding and non-coding tumor genome that may affect the activity or function of key driver oncogenes and tumor suppressors. This was demonstrated in a recent study of advanced prostate cancer integrating WGS, whole transcriptome, and ChIP-Seq data that showed tandem duplications involving non-coding regulatory regions are significantly associated with the expression of the androgen receptor (AR), a key driver of prostate cancer progression and metastasis^4,5. However, reproducibly performing such integrative analyses on the increasing quantity of whole genome data sets is limited by the current lack of automated tools for the discovery, visualization, and interpretation of recurrent SVs and their frequent targeted sites (hotspot regions).

To address this limitation, we developed SV-HotSpot, an automated tool that integrates multiple data types including SV candidates, gene expression, copy number alterations, and genome annotations to identify, annotate, and visualize recurrent SVs and their targeted hotspot regions and assess their potential consequences on the expression of nearby genes. We applied SV-HotSpot to the whole genome and transcriptome sequencing data from 101 metastatic prostate cancer patients⁴ and rediscovered both coding and non-coding recurrent SVs known to drive prostate cancer progression.

Results

Hotspots of structural variations in metastatic prostate cancer

SV-HotSpot enumerates SVs targeting genomic regions and uses a peak calling algorithm to identify regions with an elevated frequency of these events (hereafter referred to as peaks or hotspots; see Methods, Fig. 1). To demonstrate that SV-HotSpot is able to detect biologically relevant recurrent SVs, we applied it to WGS and matched RNA-Seq data from 101 metastatic prostate cancer patients⁴. To identify peaks corresponding to regulatory regions, we additionally included annotated enhancers⁶ and H3K27ac ChIP-Seq (Chromatin Immunoprecipitation Sequencing) data from prostate cancer patients⁷.

In total, we identified 296 SV hotspot sites associated with altered expression of 379 nearby genes (Fig. 2, Supplementary Tables S1, S2). SV-HotSpot identified and highly ranked hotspot sites harboring SVs associated with expression of many genes known to drive prostate cancer progression, metastasis, and treatment resistance (Fig. 2, Supplementary Table S1). Interestingly, various SV types were found to be recurrent and associated with altered expression of tumor suppressors and oncogenes including tandem duplication, chromosomal translocation, and copy number alteration.

Hotspots of tandem duplications targeting genes and regulatory elements

Recent studies have highlighted the critical roles of tandem duplications in cancers, including prostate cancer^4,5,8,9. SV-HotSpot identified various hotspots of tandem duplications targeting both coding and non-coding regions in metastatic prostate cancer. Most notably, SV-HotSpot detected a peak of recurrent SVs, composed primarily of tandem duplications, targeting a non-coding enhancer located ~625 kb upstream of AR, a key driver of prostate cancer progression, treatment resistance, and metastasis¹⁰ (Fig. 3). This region was also found to be amplified in 81% of patients (Fig. 3a, top track). SV-HotSpot detected a strong association between the presence of tandem duplications or copy number gain in this region and increased AR expression (Fig. 3b–d). Moreover, SV-HotSpot annotated this region with an active enhancer and enriched H3K27ac occupancy (Fig. 3a, bottom two tracks). These results are consistent with the recent discovery of an AR enhancer that regulates AR gene expression and is frequently duplicated in prostate cancer metastasis^4,5,11.

In addition to rediscovering the AR enhancer, SV-HotSpot detected peaks of frequent tandem duplications targeting both the coding and non-coding regions of the MYC and FOXA1 loci (Fig. 4, Supplementary Tables S1, S2). Interestingly, SV-HotSpot reported an association between tandem duplications targeting both coding and non-coding regions of the FOXA1 locus and increased gene expression (Fig. 4), consistent with a recent report¹². MYC was found to be overexpressed in the presence of copy number gain at its hotspot locus. There was also an enrichment of tandem duplications targeting both coding and non-coding regions near MYC (Supplementary Table S2).

Hotspots of deletions and translocations associated with ETS gene fusions

ETS gene fusions are well-characterized somatic genome rearrangements that drive prostate cancer tumorigenesis and define a distinct molecular subtype^13,14,15. SV-HotSpot identified and highly ranked hotspots that harbor deletions or chromosomal translocations resulting in gene fusions and increased expression of ETS transcription factor family genes. For instance, SV-HotSpot reported the highest ranked peak on chromosome 21, consisting primarily of recurrent deletion events that targeted the genomic region between the TMPRSS2 and ERG genes and correspond to the TMPRSS2-ERG fusion (Fig. 5a). Deletion of the region between TMPRSS2 and ERG was strongly associated with increased ERG expression (Fig. 5a). There was also an enrichment of translocations at the ERG locus that were associated with increased ERG expression. These translocations included events corresponding to ERG gene fusions with different 5′ partner genes (Fig. 5a).

Additionally, SV-HotSpot identified hotspots targeting other ETS genes, including ETV1 and ETV4 (Fig. 5b,c, Supplementary Table S1). Interestingly, while many types of SVs were observed at the ETV1 and ETV4 loci, only chromosomal translocations were found to be associated with increased expression of these genes. These enriched translocations corresponded to the inter-chromosomal rearrangements that created ETV1 gene fusions (10 patients, 10%, Fig. 5b) and ETV4 gene fusions (6 patients, 6%, Fig. 5c).

Hotspot of structural variations disrupting tumor suppressors

Tumor suppressor genes are often inactivated in cancers by various mechanisms, including structural rearrangements^4,16,17. SV-HotSpot detected hotspot sites associated with decreased expression of well-characterized tumor suppressor genes including PTEN, TP53, RB1, and CDKN1B (Fig. 6, Supplementary Table S1). Notably, various SV types were found to target the PTEN and TP53 loci and, regardless of SV type, events targeting these tumor suppressors were often associated with decreased gene expression (Fig. 6). For example, all types of SVs targeting the PTEN hotspot (deletion, duplication, translocation, and inversion) were associated with decreased PTEN expression (Fig. 6a, Supplementary Table S1). This is consistent with previous reports that PTEN is often disrupted by various forms of chromosomal rearrangements^4,16. Similarly, TP53 expression was decreased in the presence of deletion, translocation, and inversion (Fig. 6b, Supplementary Table S1).

Taken together, via reanalysis of public datasets, we showed that SV-HotSpot was able to detect hotspots of recurrent SVs known to contribute to prostate cancer development, progression, metastasis, and treatment resistance.

Discussion

Here, we present SV-HotSpot, an automated pipeline to identify, annotate, and visualize hotspots of recurrent SVs and evaluate their potential consequences on the expression of nearby genes. Despite the great success of recent studies in identifying recurrent SVs and assessing their impact^4,18, these approaches require a significant amount of work and ad hoc analyses to integrate multiple types of data and evaluate the potential effect of SVs on gene expression. SV-HotSpot integrates and analyzes multiple data types, including SV candidates, gene expression, copy number alterations, and functional elements, to discover recurrent SV hotspots. Additionally, it evaluates the associations between recurrent SVs and various genome annotations and functional elements, as well as their potential consequences on gene expression. Furthermore, SV-HotSpot provides visualizations to facilitate the interpretation of the results. As a fully automated tool, SV-HotSpot allows for customized and reproducible analyses.

SV-HotSpot uses a sliding window approach that generalizes the frequently used genomic binning approach and allows smoothing of the sample counts for effective peak calling. Using a peak calling algorithm to identify recurrent SVs enables systematic identification of regions with a statistically elevated frequency of SVs, which are more likely to be functional. This approach is similar to that employed by the GISTIC tool to identify genes recurrently targeted by copy number alteration¹⁹. Compared with GISTIC, which focuses on the identification of focal copy number alterations, SV-HotSpot additionally integrates a broad spectrum of structural variations, gene expression, and regulatory elements, and was thus able to identify other types of recurrent SVs targeting regulatory elements that drive gene expression, such as tandem duplication of the AR enhancer. Our approach is also complementary to existing network biology approaches, such as those using molecular interaction data and information flow methods to find associations between genes and diseases²⁰.

Through our reanalysis of metastatic prostate cancer patient data, we demonstrated the utility of SV-HotSpot for detecting biologically relevant and well-characterized recurrent SVs that regulate the expression of nearby genes. We identified key prostate cancer driver genes as the most significantly associated genes with their commonly known recurrent SVs including tandem duplication of the AR enhancer, deletion of the TMPRSS2-ERG region, and genomic disruption of PTEN. Moreover, our thorough evaluation of expression association allowed us to identify specific types of SVs known to affect gene expression including those with lower frequency such as translocations resulting in gene fusions of ETV1 and ETV4, and tandem duplication affecting FOXA1. Overall, SV-HotSpot is a valuable tool for the cancer research community to integrate the growing whole genome, transcriptome, and epigenetic data to discover biologically relevant SV hotspots. Although the tool was applied to human cancer data in this study, it can also be applied to data from other species and diseases.

Methods

SV-HotSpot consists of four main steps (Fig. 1): (1) detection of SV hotspots, (2) annotation of SV hotspots, (3) evaluation of the association between hotspot SVs and the expression of nearby genes, and (4) visualization.

In the first step, SV-HotSpot identifies regions with an elevated frequency of SVs by applying a peak calling approach to counts of samples harboring SVs that target sliding windows over each chromosome. First, it takes the SV candidates (in BEDPE, Browser Extensible Data Paired-End format) as input and counts the number of samples harboring SV breakpoints (in the case of translocations, insertions, and inversions) or regions (in the case of duplications and deletions) that overlap each sliding window. The entire duplicated or deleted region is considered because these events directly affect the genome elements they contain by changing their copy numbers, whereas other events potentially affect only elements near their break ends. SV-HotSpot then applies the peakPick peak calling algorithm²¹ to identify windows (hereafter referred to as ‘peaks’) where counts are significantly higher than those of the surrounding windows. Peaks occurring in at least a user-defined percentage of SV samples are identified as potential peaks. Once all potential peaks are identified, SV-HotSpot applies a peak merging algorithm to group adjacent peaks with similar sample counts, as these likely result from the same genome rearrangements targeting the same sites. The peak merging algorithm first identifies clusters of adjacent peaks in which any two contiguous peaks are within a predefined distance. Next, it selects the top peak (the peak with the highest sample count) in a cluster and moves upstream and downstream, merging peaks until it observes k peaks (k is small, e.g. 1–3, and predefined) whose sample counts differ significantly from that of the top peak (predefined parameter delta, e.g. 5%). This process is repeated until no peaks in the cluster remain. The merged peaks are then considered final peaks for subsequent analyses.
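The window counting and peak clustering described above can be sketched as follows. This is a minimal illustration, not SV-HotSpot's own code (the tool is implemented in R and Perl): the function names, the tuple layout of an SV record, the SV type labels, and the half-open window convention are all assumptions made for the example. Peak calling itself (peakPick) and the k/delta merging rule are not reproduced here.

```python
from collections import defaultdict

# Assumed labels: duplications and deletions are counted over their whole span,
# every other SV type only at its breakpoints.
SPAN_TYPES = {"DUP", "DEL"}


def window_sample_counts(svs, chrom, chrom_len, window=100_000, step=30_000):
    """Count distinct samples with an SV touching each sliding window.

    svs: iterable of (sample, sv_type, chrom1, pos1, chrom2, pos2) tuples
         (a hypothetical, simplified stand-in for BEDPE records).
    Returns a list of (window_start, window_end, n_samples); windows are
    half-open intervals [start, end).
    """
    starts = range(0, max(chrom_len - window, 0) + 1, step)
    samples = defaultdict(set)
    for sample, sv_type, c1, p1, c2, p2 in svs:
        if sv_type in SPAN_TYPES and c1 == c2 == chrom:
            intervals = [(min(p1, p2), max(p1, p2))]
        else:
            intervals = [(p, p) for c, p in ((c1, p1), (c2, p2)) if c == chrom]
        for lo, hi in intervals:
            for i, s in enumerate(starts):
                if s <= hi and lo < s + window:
                    samples[i].add(sample)
    return [(s, s + window, len(samples[i])) for i, s in enumerate(starts)]


def cluster_peaks(peaks, max_gap=50_000):
    """Group (start, end) peaks so that contiguous peaks are within max_gap."""
    clusters = []
    for peak in sorted(peaks):
        if clusters and peak[0] - clusters[-1][-1][1] <= max_gap:
            clusters[-1].append(peak)
        else:
            clusters.append([peak])
    return clusters


svs = [
    ("s1", "DEL", "chr1", 120_000, "chr1", 180_000),
    ("s2", "TRA", "chr1", 60_000, "chr2", 10_000),
    ("s1", "INV", "chr1", 10_000, "chr1", 260_000),
]
counts = window_sample_counts(svs, "chr1", 300_000, window=100_000, step=50_000)
```

Counting distinct samples rather than events keeps a single heavily rearranged sample from creating a peak on its own, which matches the per-sample counting described in the text.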

In the second step, identified peaks are annotated with nearby genes and overlapping regulatory elements such as enhancers and promoters, provided as input in BED (Browser Extensible Data) format using BEDTools²². All annotated peaks are then summarized and output in BED format.
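As a rough illustration of this annotation step, the sketch below finds regulatory elements that overlap a peak and genes that lie within a flanking distance of it. SV-HotSpot performs this with BEDTools; the pure-Python functions here, the tuple layout, and the `gene_flank` parameter are hypothetical stand-ins chosen only to show the interval logic.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open BED intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def annotate_peaks(peaks, genes, elements, gene_flank=500_000):
    """Attach nearby genes and overlapping regulatory elements to each peak.

    peaks, genes, elements: lists of (chrom, start, end, name) tuples.
    gene_flank: assumed distance (bp) within which a gene counts as "nearby".
    """
    annotated = []
    for chrom, start, end, name in peaks:
        nearby = [g for c, s, e, g in genes
                  if c == chrom and overlaps(start - gene_flank, end + gene_flank, s, e)]
        hits = [x for c, s, e, x in elements
                if c == chrom and overlaps(start, end, s, e)]
        annotated.append({"peak": name, "genes": nearby, "elements": hits})
    return annotated


result = annotate_peaks(
    peaks=[("chrX", 1000, 2000, "pk1")],
    genes=[("chrX", 2500, 3000, "G1"), ("chrX", 9000, 9500, "G2"), ("chr1", 1000, 2000, "G3")],
    elements=[("chrX", 1500, 1600, "enh1"), ("chrX", 2000, 2100, "enh2")],
    gene_flank=1000,
)
```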

In the third step, gene expression and copy number data are incorporated to evaluate whether the presence of SVs at each peak is associated with altered expression of each nearby gene. A Wilcoxon rank-sum test or t-test (chosen by the user) is used to compare the expression of the nearby gene between samples harboring and not harboring a hotspot SV. The same test is also performed within sample stratifications based on the copy number status (gain, loss, or neutral) of the nearby gene. More specifically, SV-HotSpot applies 12 different comparisons (listed below) to determine the association between a hotspot and a nearby gene.

1. Comparison of the expression of a nearby gene between samples harboring hotspot SVs and those not harboring hotspot SVs, without considering the copy number status of the gene, to determine the overall association between recurrent SVs and the expression of the gene.
2. Similar to (1), five comparisons are performed between samples harboring each of the five individual SV types (duplication, deletion, translocation, insertion, and inversion) targeting the hotspot and those without any SVs targeting the hotspot, to identify whether the overall expression association is driven by specific SV types.
3. Comparison of the expression of a nearby gene between samples harboring hotspot SVs and those not harboring hotspot SVs, but only among samples without copy number alteration of the gene, to determine whether the association is driven by SVs without the confounding impact of copy number alterations of the gene.
4. Similar to (3), five comparisons are performed between samples harboring each of the five individual SV types (duplication, deletion, translocation, insertion, and inversion) targeting the hotspot and those without any SVs targeting the hotspot, to identify whether the expression association is driven by specific SV types.
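The 12 comparisons listed above (1 + 5 over all samples, then 1 + 5 among samples without copy number alteration of the gene) can be enumerated as in the sketch below. The data layout, SV type labels, and comparison names are assumptions for illustration; the statistical test itself (Wilcoxon rank-sum or t-test) would then be run on the expression values of each returned pair of groups.

```python
SV_TYPES = ("DUP", "DEL", "TRA", "INS", "INV")  # assumed labels for the five SV types


def build_comparisons(samples):
    """Build the sample groups for the 12 expression comparisons.

    samples: dict sample_id -> {"sv_types": set of SV types at the hotspot,
                                "cn": "gain" | "loss" | "neutral"}
    Returns dict comparison_name -> (ids_with_sv, ids_without_any_sv).
    """
    neutral = {s: v for s, v in samples.items() if v["cn"] == "neutral"}
    comparisons = {}
    for label, pool in (("all", samples), ("cn_neutral", neutral)):
        without = sorted(s for s, v in pool.items() if not v["sv_types"])
        with_any = sorted(s for s, v in pool.items() if v["sv_types"])
        comparisons[f"{label}:any_sv"] = (with_any, without)
        for sv_type in SV_TYPES:
            with_type = sorted(s for s, v in pool.items() if sv_type in v["sv_types"])
            comparisons[f"{label}:{sv_type}"] = (with_type, without)
    return comparisons


samples = {
    "a": {"sv_types": {"DUP"}, "cn": "gain"},
    "b": {"sv_types": set(), "cn": "neutral"},
    "c": {"sv_types": {"DUP", "INV"}, "cn": "neutral"},
    "d": {"sv_types": set(), "cn": "loss"},
}
comparisons = build_comparisons(samples)
```

In both strata the reference group is the set of samples with no SV of any type at the hotspot, as stated in items 2 and 4.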

We determine that the presence of SVs at a peak is associated with the expression of a nearby gene if any of the above comparisons results in the rejection of the null hypothesis that there is no difference in gene expression between groups. To achieve this, Fisher’s method²³ is used to combine the p-values of these tests. Subsequently, the false discovery rate (FDR) is estimated by applying the Benjamini–Hochberg procedure²⁴ to the Fisher’s combined p-values.
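For reference, Fisher's method combines k p-values through the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, and the Benjamini–Hochberg procedure adjusts the i-th smallest of m p-values to p·m/i with monotonicity enforced. The sketch below implements both with the standard library only; the function names are illustrative and it assumes all p-values are strictly positive.

```python
import math


def fisher_combined_p(pvalues):
    """Combine p-values with Fisher's method (requires every p > 0)."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # X / 2, where X = -2 * sum(ln p)
    # Chi-square survival function for even degrees of freedom 2k has the
    # closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped
    # by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


combined = fisher_combined_p([0.05, 0.05])
qvalues = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```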

Additionally, SV-HotSpot groups dependent peaks that are likely driven by the same SV events and have similar consequences on the expression of nearby genes into peak families. To achieve this, after the peaks are identified, those associated with the same gene are tested for dependency using a Fisher’s exact test. If peaks are found to be dependent (significant overlap of the samples harboring SVs between the peaks), they are grouped as a peak family. The top-ranked peak (the peak with highest count of samples harboring the hotspot SVs) is reported as the representative of the peak family.
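The dependency test between two peaks can be illustrated with a one-sided Fisher's exact test on the 2×2 table of samples that do or do not harbor SVs at each peak. The sketch below is an assumption about how such a test could be set up (the source does not state the sidedness or the exact table), and the function name is hypothetical.

```python
from math import comb


def overlap_pvalue(samples_a, samples_b, all_samples):
    """One-sided Fisher's exact test for excess sample overlap between two peaks.

    samples_a, samples_b: sets of samples harboring SVs at peak A and peak B.
    all_samples: set of every sample in the cohort.
    Returns P(overlap >= observed) under the hypergeometric null.
    """
    n = len(all_samples)
    row = len(samples_a)
    col = len(samples_b)
    observed = len(samples_a & samples_b)
    tail = 0
    for x in range(observed, min(row, col) + 1):
        # Ways to pick x of B's samples from A and the rest from outside A.
        tail += comb(row, x) * comb(n - row, col - x)
    return tail / comb(n, col)


cohort = {f"s{i}" for i in range(10)}
same = overlap_pvalue({"s0", "s1", "s2", "s3"}, {"s0", "s1", "s2", "s3"}, cohort)
disjoint = overlap_pvalue({"s0", "s1"}, {"s2", "s3"}, cohort)
```

A small p-value indicates that the two peaks are hit in largely the same samples, so they would be grouped into one peak family.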

Finally, SV-HotSpot generates multiple visualizations (Fig. 2 top panel, Fig. 3) for interpretation of the genomic context and the association between SVs and gene expression. In these visualizations, SV-HotSpot overlays multiple tracks to show copy number alterations, SV breakpoint aggregation, segments of duplications and deletions, gene and regulatory element annotation, and ChIP-Seq coverage in close proximity to the peaks and its nearby genes (Fig. 3a). In addition, the expression of nearby genes is plotted to highlight associations with recurrent SVs (Fig. 3b), with different types of SVs (Fig. 3c), and with copy number status of both the peak and nearby genes (Fig. 3d). SV-HotSpot also provides an additional visualization of the distribution of identified peaks on each chromosome (Fig. 2, top panel). Furthermore, SV-HotSpot generates a custom track file for each chromosome that can be viewed on the UCSC Genome Browser.

For analyses reported in the Results section, SV-HotSpot was run using a sliding window size of 100 kb, step size of 30 kb, peak merging distance of 50 kb, default parameters for peakPick, and peak merging parameters k = 1, delta = 5%. Only peaks smaller than 500 kb (except for those associated with altered expression of a COSMIC census gene²⁵) and present in at least 15% of samples with an FDR < 0.05 and at least one of 12 expression associations significant at a p-value < 0.05 (Wilcoxon test) were retained. Additionally, only genes with mean expression > 10 TPM (Transcripts Per Million) in a group from a significant comparison were retained.
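The retention criteria above can be read as a single predicate per peak–gene pair, sketched below. The dictionary keys are hypothetical, the COSMIC exception is interpreted as applying to the size limit only, and the expression criterion is simplified to a single precomputed value; none of this is taken from the SV-HotSpot code itself.

```python
def keep_peak(peak, cosmic_genes):
    """Apply the reported filters to one peak-gene association.

    peak: dict with assumed keys start, end, gene, pct_samples, fdr,
          pvalues (the 12 comparison p-values) and max_group_mean_tpm.
    cosmic_genes: set of COSMIC census gene names.
    """
    size_ok = (peak["end"] - peak["start"]) < 500_000 or peak["gene"] in cosmic_genes
    return (
        size_ok
        and peak["pct_samples"] >= 15.0
        and peak["fdr"] < 0.05
        and min(peak["pvalues"]) < 0.05
        and peak["max_group_mean_tpm"] > 10.0
    )


example = {
    "start": 0, "end": 600_000, "gene": "PTEN", "pct_samples": 20.0,
    "fdr": 0.01, "pvalues": [0.2, 0.003], "max_group_mean_tpm": 35.0,
}
kept_with_exception = keep_peak(example, cosmic_genes={"PTEN"})
kept_without_exception = keep_peak(example, cosmic_genes=set())
```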

Data availability

SV-HotSpot is a Linux-based command-line pipeline implemented in R and Perl and can be run as a Docker container or Bioconda package. SV-HotSpot is available at https://github.com/ChrisMaherLab/SV-HotSpot.

References

1. Yang, L. et al. Diverse mechanisms of somatic structural variations in human cancer genomes. Cell 153, 919–929 (2013).
2. Feuk, L., Carson, A. R. & Scherer, S. W. Structural variation in the human genome. Nat. Rev. Genet. 7, 85–97 (2006).
3. Stankiewicz, P. & Lupski, J. R. Structural variation in the human genome and its role in disease. Annu. Rev. Med. 61, 437–455 (2010).
4. Quigley, D. A. et al. Genomic hallmarks and structural variation in metastatic prostate cancer. Cell 174, 758-769.e9 (2018).
5. Viswanathan, S. R. et al. Structural alterations driving castration-resistant prostate cancer revealed by linked-read genome sequencing. Cell 174, 433-447.e19 (2018).
6. Fishilevich, S. et al. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database Oxford Press (2017). https://doi.org/10.1093/database/bax028
7. Kron, K. J. et al. TMPRSS2-ERG fusion co-opts master transcription factors and activates NOTCH signaling in primary prostate cancer. Nat. Genet. 49, 1336–1345 (2017).
8. Wu, Y.-M. et al. Inactivation of CDK12 delineates a distinct immunogenic class of advanced prostate cancer. Cell 173, 1770-1782.e14 (2018).
9. Menghi, F. et al. The tandem duplicator phenotype is a prevalent genome-wide cancer configuration driven by distinct gene mutations. Cancer Cell 34, 197-210.e5 (2018).
10. Heinlein, C. A. & Chang, C. Androgen receptor in prostate cancer. Endocr. Rev. 25, 276–308 (2004).
11. Takeda, D. Y. et al. A somatically acquired enhancer of the androgen receptor is a noncoding driver in advanced prostate cancer. Cell 174, 422-432.e13 (2018).
12. Parolia, A. et al. Distinct structural classes of activating FOXA1 alterations in advanced prostate cancer. Nature 571, 413–418 (2019).
13. Tomlins, S. A. et al. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310, 644–648 (2005).
14. Tomlins, S. A. et al. Role of the TMPRSS2-ERG gene fusion in prostate cancer. Neoplasia 10, 177–188 (2008).
15. Clark, J. P. & Cooper, C. S. ETS gene fusions in prostate cancer. Nat. Rev. Urol. 6, 429–439 (2009).
16. Zhang, Y. et al. A pan-cancer compendium of genes deregulated by somatic genomic rearrangement across more than 1,400 cases. Cell Rep. 24, 515–527 (2018).
17. Chen, X. et al. Recurrent somatic structural variations contribute to tumorigenesis in pediatric osteosarcoma. Cell Rep. 7, 104–112 (2014).
18. Fraser, M. et al. Genomic hallmarks of localized, non-indolent prostate cancer. Nature 541, 359–364 (2017).
19. Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).
20. Chen, Y. et al. Identifying potential cancer driver genes by genomic data integration. Sci. Rep. 3, 3538 (2013).
21. Weber, C. M., Ramachandran, S. & Henikoff, S. Nucleosomes are context-specific, H2A.Z-modulated barriers to RNA polymerase. Mol. Cell 53, 819–830 (2014).
22. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
23. Fisher, R. A. et al. Statistical Methods for Research Workers (Springer, Berlin, 1934).
24. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodol.) 57, 289–300 (1995).
25. Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucl. Acids Res. 45, D777–D783 (2017).

Funding

This work has been supported by a Prostate Cancer Foundation Challenge Award (to C.A.M., F.Y.F., and S.M.D.), a Prostate Cancer Foundation Young Investigator Award (to D.A.Q., R.Y., and S.G.Z.), a BRCA Foundation Young Investigator Award (to D.A.Q.), an American Cancer Society Institutional Research Grant Number IRG-18-158-61 (to H.X.D.), and an NIH National Cancer Institute Grant Number R01CA174777 (to S.M.D.).

Author information

These authors jointly supervised this work: Ha X. Dang and Christopher A. Maher.

Authors and Affiliations

McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, 63110, USA
Abdallah M. Eteleeb, Ha X. Dang & Christopher A. Maher
Department of Internal Medicine, Washington University School of Medicine, St. Louis, MO, 63110, USA
Abdallah M. Eteleeb, Duy Pham, Ha X. Dang & Christopher A. Maher
Department of Urology, University of California San Francisco (UCSF), San Francisco, CA, 94158, USA
David A. Quigley
Helen Diller Family Comprehensive Cancer Center, University of California San Francisco (UCSF), San Francisco, CA, 94158, USA
David A. Quigley & Felix Y. Feng
Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, 48109, USA
Shuang G. Zhao
The Hormel Institute, University of Minnesota, Austin, MN, 55912, USA
Rendong Yang
Masonic Cancer Center, University of Minnesota, Minneapolis, MN, 55455, USA
Scott M. Dehm
Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, 55455, USA
Scott M. Dehm
Department of Surgery, Washington University School of Medicine, St. Louis, MO, 63110, USA
Jingqin Luo
Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, 63110, USA
Jingqin Luo, Ha X. Dang & Christopher A. Maher
Department of Radiation Oncology, University of California San Francisco (UCSF), San Francisco, CA, 94143, USA
Felix Y. Feng
Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO, 63105, USA
Christopher A. Maher

Contributions

A.M.E., H.X.D., and C.A.M. conceived the study. A.M.E. and H.X.D. designed and developed the SV-HotSpot pipeline. D.A.Q., S.G.Z., F.Y.F., R.Y., and S.M.D. provided the datasets and insights on the SV-HotSpot development. J.L. provided statistical insight that has been implemented in the tool. A.M.E. analyzed the data. A.M.E., H.X.D., and C.A.M. wrote the manuscript. D.A.Q., S.G.Z., F.Y.F., S.M.D., and J.L. critically revised the manuscript. H.X.D. and C.A.M. supervised the project. All authors reviewed, edited, and approved the final manuscript.

Corresponding author

Correspondence to Christopher A. Maher.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Tables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Eteleeb, A.M., Quigley, D.A., Zhao, S.G. et al. SV-HotSpot: detection and visualization of hotspots targeted by structural variants associated with gene expression. Sci Rep 10, 15890 (2020). https://doi.org/10.1038/s41598-020-71168-7

Received: 22 July 2019
Accepted: 09 August 2020
Published: 28 September 2020
DOI: https://doi.org/10.1038/s41598-020-71168-7
