LASCA: loop and significant contact annotation pipeline

Luzhin, Artem V.; Golov, Arkadiy K.; Gavrilov, Alexey A.; Velichko, Artem K.; Ulianov, Sergey V.; Razin, Sergey V.; Kantidze, Omar L.

doi:10.1038/s41598-021-85970-4

Article
Open access
Published: 18 March 2021

Artem V. Luzhin^1,2,
Arkadiy K. Golov¹,
Alexey A. Gavrilov^1,2,
Artem K. Velichko^1,2,3,
Sergey V. Ulianov¹,
Sergey V. Razin¹ &
…
Omar L. Kantidze¹

Scientific Reports volume 11, Article number: 6361 (2021)

Abstract

Chromatin loops represent one of the major levels of hierarchical folding of the genome. Although the situation is evolving, current methods have various difficulties with the accurate mapping of loops even in mammalian Hi-C data, and most of them fail to identify chromatin loops in animal species with substantially different genome architecture. This paper presents the loop and significant contact annotation (LASCA) pipeline, which uses Weibull distribution-based modeling to effectively identify loops and enhancer–promoter interactions in Hi-C data from evolutionarily distant species: from yeast and worms to mammals. Available at: https://github.com/ArtemLuzhin/LASCA_pipeline.

Introduction

Techniques exploiting the proximity ligation procedure (so-called C-methods) have significantly improved our understanding of the spatial (3D) genome organization. C-methods have confirmed the existence of chromosomal territories that are spatially compartmentalized into active and repressed chromatin domains, referred to as A and B compartments¹. These megabase-scale compartments are partitioned into self-interacting structures, termed topologically associating domains (TADs). The distinguishing feature of TADs is that spatial contacts of remote genomic elements are more frequent within TADs than between individual TADs^2,3,4. Boundaries between individual TADs are enriched with the cohesin complex and CCCTC-binding factor (CTCF). Recent data show that TADs are formed via dynamic DNA loop extrusion and may harbor smaller contact domains, some of which are chromatin loops^5,6. Along with chromatin compartments and TADs, chromatin loops represent one of the major levels of hierarchical folding of the genome. Although most of the loops in mammals are anchored by CTCF and cohesin, several other proteins can mediate long-distance genomic interactions: Yin Yang 1 (YY1), zinc finger protein 143 (ZNF143), and LIM domain-binding factor 1 (LDB1)⁷. All three are involved in establishing enhancer-promoter interactions, either by directly binding and bridging specific genomic sites (YY1 and ZNF143)^8,9 or by binding to a subset of transcription factors (LDB1)¹⁰. Species that lack CTCF-dependent loops can, however, utilize DNA-extruding complexes such as condensin and cohesin to organize their genomes or genome parts into consecutive similar-sized chromatin loops. Specifically, the Caenorhabditis elegans X chromosome contains dozens of loops associated with a condensin-like dosage compensation complex (DCC)^11,12. In budding yeast, S-phase chromatin forms consecutive loops whose base points often colocalize with binding sites of the cohesin protein Scc1¹³.

In contrast to chromatin compartments and TADs, chromatin loops have clear biological roles, such as bringing gene promoters to their cognate cis-regulatory elements¹⁴. Hence, the identification of loops is an essential part of most studies that involve Hi-C-based 3D genome analyses. Methods that have been developed to detect chromatin loops and/or statistically significant genomic interactions from Hi-C contact maps comprise two groups: (1) statistical/probabilistic model-based methods (e.g. Fit-Hi-C¹⁵, HiC-DC¹⁶), and (2) peak-calling methods (e.g. HiCCUPS¹⁷, MUSTACHE¹⁸, SIP¹¹, cLoops¹⁹). All of them are primarily focused on mapping loops in Hi-C datasets from mammals and experience challenges (in most cases insurmountable) when working with Hi-C datasets from species with substantially different genome architecture. Here, we present the LASCA pipeline, which uses Weibull distribution-based modeling to effectively identify chromatin loops, including enhancer-promoter interactions, in Hi-C data from different species (human, mouse, nematode, budding yeast). Our results demonstrate that LASCA-detected loops are (1) reproducible, (2) highly supported by aggregate peak analyses and genomic/epigenomic correlates of loop formation, and (3) validated by protein-centric chromatin conformation methods (ChIA-PET and HiChIP). We have compared LASCA with the most commonly used methods from each of the abovementioned groups (HiCCUPS and Fit-Hi-C) and with a very recent approach, MUSTACHE. Working with mammalian Hi-C data, LASCA showed very similar results to HiCCUPS and MUSTACHE, and even outperformed HiCCUPS in detecting CTCF-independent loops. In contrast to the methods compared, LASCA could also detect chromatin loops in C. elegans and S. cerevisiae, which makes it an omni-purpose approach.

Results and discussion

Description of the LASCA pipeline

In Hi-C heatmaps, loops appear as bright dots located at different distances from the central diagonal. Here, we applied Weibull distribution-based modeling^20,21,22 to identify these significant interactions. A comparison of several statistical distributions (Weibull, normal, log-normal, and gamma) performed in Sanyal et al.²⁰ clearly showed that the Weibull distribution fits Hi-C interaction frequency the best. We further analyzed the quality of the Weibull distribution fitting of Hi-C interaction frequency at different distance ranges and found that its good performance is distance-independent (Supplementary Fig. S1). The LASCA pipeline works with corrected Hi-C matrices in the Cooler format²³. First, the Weibull distribution-based statistical background model is fitted to each diagonal of the Hi-C matrix (Fig. 1). The p-value for every pixel in the heatmap is calculated as the probability of finding a corresponding model pixel with the same or higher intensity. To obtain corrected p-values (q-values), the Benjamini–Hochberg method is applied. Optionally, q-values are additionally corrected in accordance with the scaling of a particular chromosome. A q-value threshold is defined by the user to determine significant pixels (contacts). Adjacent significant pixels are grouped into clusters; for each cluster, the center is defined, and its coordinates are retrieved and considered as the loop coordinates. Identified loops may be filtered according to their aggregate peak analysis (APA)¹⁷ or peak analysis (PA)¹¹ scores, signal intensity, and signal enrichment over random signals located at the same distance from the central diagonal. The loops should also display a signal decay from the central loop pixel. These filters are optional and should be used depending on the species under study. Specifically, using these filters with mammalian Hi-C data will enhance the accuracy of loop annotation, whereas utilizing the filters with the yeast genome appears to be less useful.
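As an illustration of the background model described above, the following minimal sketch fits a two-parameter Weibull distribution to the intensities of a single diagonal and converts an observed intensity into a p-value via the survival function. It is not the LASCA implementation: the function names `fit_weibull` and `weibull_p_value`, the bisection bracket, and the decision to fix the location at zero and ignore zero-valued pixels are all assumptions made for this example.

```python
import math

def fit_weibull(values, lo=0.05, hi=50.0, iterations=100):
    """Maximum-likelihood Weibull fit (location fixed at 0) to positive values.

    Returns (shape, scale). The shape is found by bisection on the standard
    MLE score equation; the bracket [lo, hi] is an assumption of this sketch.
    """
    logs = [math.log(v) for v in values if v > 0]  # zero pixels are skipped
    mean_log = sum(logs) / len(logs)
    max_log = max(logs)

    def weights(k):
        # x**k rescaled by max(x)**k so that exp() cannot overflow
        return [math.exp(k * (l - max_log)) for l in logs]

    def score(k):
        # sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x); increasing in k
        w = weights(k)
        return sum(wi * l for wi, l in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    w = weights(k)
    # scale = (mean(x^k))^(1/k), computed in log space
    scale = math.exp(max_log + math.log(sum(w) / len(w)) / k)
    return k, scale

def weibull_p_value(x, k, scale):
    """P(X >= x) under Weibull(k, scale), i.e. the survival function."""
    return math.exp(-((x / scale) ** k)) if x > 0 else 1.0
```

In use, one would collect the corrected contact frequencies of one diagonal, call `fit_weibull` on them, and then evaluate `weibull_p_value` for every pixel of that diagonal before multiple-testing correction.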

Identification of chromatin loops with the LASCA pipeline

We applied the LASCA pipeline to identify chromatin loops in Hi-C data from evolutionarily distant species: H. sapiens (GM12878 cells¹⁷), M. musculus (CH12-LX cells¹⁷), C. elegans²⁴, and S. cerevisiae (S-phase cells¹³). We managed to annotate a significant number of loops in each case (Fig. 2a). Visual inspection of loop-annotated Hi-C heatmaps suggested good accuracy of LASCA in loop identification, particularly in worm and yeast datasets (Fig. 1a). We further evaluated the quality of loop mapping using a widely accepted metaplot analysis^17,25 (Fig. 2b). Loops in human and mouse datasets displayed a classic signal decay from the loop center to the surrounding regions, with a crosshair pattern that corresponded to loop extrusion¹¹. Similar though less saturated patterns were observed for worm and yeast loops (Fig. 2b). We also showed that anchors of loops identified by LASCA were enriched in ChIP-seq mapped binding sites of proteins important for looping: human or mouse CTCF, condensin subunit DPY-27 of C. elegans, and yeast cohesin subunit Scc1 (Fig. 2c). Approximately 60% of the LASCA-detected loops showed enrichment of the corresponding chromatin architecture factor at least at one of their base points (hCTCF: 64%; mCTCF: 56%; DPY-27: 76% for the X chromosome; Scc1: 57%). Together, these results demonstrate the good performance and versatility of the LASCA pipeline.

To assess the reproducibility of LASCA loop calls, we compared its performance on two replicates of GM12878 cell line Hi-C matrices. The results indicate that LASCA has a very high self-consistency level (73.4%/83.7%; Fig. 3a), which is comparable to or even better than that of HiCCUPS and MUSTACHE (Fig. 3a). To validate LASCA-predicted loops, we compared them with loops predicted in GM12878 cells by protein-centric C-methods, CTCF ChIA-PET²² and CTCF HiChIP²⁶. The vast majority of the chromatin loops detected by LASCA were recapitulated by ChIA-PET (84.6%) and HiChIP (68%) loops (Fig. 3b, c). MUSTACHE and HiCCUPS showed virtually the same performance in this analysis (Fig. 3b, c). Taken together, these results confirm the high accuracy and reproducibility of the LASCA pipeline.

We compared the performance of LASCA with the most commonly used chromatin loop- and significant contact-detecting methods, HiCCUPS and Fit-Hi-C, and with a very recent approach, MUSTACHE. It was not possible to adequately compare LASCA with Fit-Hi-C because the latter identified several orders of magnitude more contacts than LASCA (~48 million in the GM12878 cell line and ~34.5 thousand in S. cerevisiae). Nevertheless, we found that most of the LASCA-identified chromatin loops belonged to significant genomic contacts detected by Fit-Hi-C (97.3% of GM12878 loops and 94.6% of yeast loops). LASCA identified approximately the same number of loops as MUSTACHE and twice as many loops as HiCCUPS in Hi-C datasets from both human and mouse (Fig. 4a and Supplementary Fig. S2). Loops identified by LASCA had a larger median size than HiCCUPS-called loops (Fig. 4b). Direct comparison of loops called by LASCA, HiCCUPS, and MUSTACHE demonstrated a good overlap (Fig. 4c). Approximately 70% of HiCCUPS loops and 85% of MUSTACHE loops in GM12878 cells were identified by LASCA (Fig. 4c).

It is noteworthy that a significant portion of LASCA loops (60%) was not mapped by HiCCUPS (Fig. 4c). To find out the characteristics of these additional LASCA-identified loops in human cells, we performed metaplot analysis and checked the presence of CTCF binding sites and specific epigenetic features (ATAC-seq and histone H3K27Ac peaks) at the bases of i) loops identified by both LASCA and HiCCUPS, ii) HiCCUPS-specific loops, and iii) LASCA-specific loops (Supplementary Fig. S3). The results obtained clearly showed that HiCCUPS detected mostly CTCF-dependent loops with a high proportion of active enhancers marked by H3K27 acetylation (Supplementary Fig. S3). LASCA-specific loops, at the same time, were slightly different in structure as evidenced from metaplot analysis, did not strongly depend on CTCF, and were not associated with the active enhancers (Supplementary Fig. S3). In summary, these results suggest that LASCA identifies additional loops that are not annotated by HiCCUPS.

Identification of enhancer-promoter interactions with the LASCA pipeline

The LASCA pipeline is suitable for the identification of enhancer-promoter interactions (Fig. 5a). In this case, LASCA is run without chromatin loop-specific filters and retrieves the coordinates of all significant genomic contacts at distances of up to two megabases. Anchors of the contacts are then intersected with a list of gene promoters; the contacts (loops) with one of the two anchors coinciding with a promoter are selected for further analysis. In these loops, the second anchor is considered to be an enhancer. To verify that LASCA did annotate enhancers, we tested the epigenetic profile around the predicted enhancers. This profiling clearly illustrated that the regions predicted to be enhancers displayed typical enhancer-specific epigenetic marks (Fig. 5b).

Conclusions

LASCA allows mapping of chromatin loops as well as enhancer–promoter interactions in Hi-C datasets obtained from species with different genome sizes and genome organization complexity, such as human, mouse, worm, and yeast. To annotate representative loops/contacts, LASCA requires minimal adjustments and filtering, particularly when analyzing worm or yeast data. High-quality performance with Hi-C data from C. elegans and S. cerevisiae distinguishes LASCA from most other chromatin loop callers. LASCA, the protocol, and the suite of scripts are publicly available at https://github.com/ArtemLuzhin/LASCA_pipeline.

Methods

Loop annotation

The LASCA pipeline consists of three main steps (Fig. 1). In the first step, the Weibull distribution-based statistical background model is fitted to each diagonal of the corrected Hi-C matrix; for each pixel, the p-value is calculated as the probability of finding a model pixel with the same or higher intensity, and FDR correction of the p-values is performed to obtain the corresponding q-values (default argument: 0.1). Additionally, q-values may be corrected in accordance with the scaling of a particular chromosome (default argument: turned off). Briefly, for the selected range of diagonals of the Hi-C matrix, the average contact frequency of each diagonal is calculated. All of these averages are then divided by the average contact frequency of the first diagonal, forming a set of normalization coefficients with a value of 1 for the first diagonal. The q-values in each diagonal are divided by the normalization coefficient determined for that diagonal. The q-value threshold is defined by the user (default argument: 0.1) to determine significant pixels (contacts).
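The FDR correction and the optional scaling adjustment of the first step can be sketched as follows. This is an illustrative reading of the description above rather than the LASCA source code; the function names and the list-based data layout are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q_values[i] = running_min
    return q_values

def scaling_coefficients(diagonal_means):
    """Divide each diagonal's mean contact frequency by that of the first one.

    The first coefficient is therefore 1, and coefficients shrink as the
    mean contact frequency decays with genomic distance.
    """
    first = diagonal_means[0]
    return [m / first for m in diagonal_means]

def adjust_q_values(q_by_diagonal, coefficients):
    """Divide the q-values of every diagonal by that diagonal's coefficient."""
    return [[q / c for q in qs] for qs, c in zip(q_by_diagonal, coefficients)]
```

Because the coefficients of distant diagonals are below 1, dividing by them raises the q-values there, so under this reading the adjustment makes the threshold stricter at larger genomic distances.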

In the second step, adjacent significant pixels are grouped into clusters using the density-based spatial clustering of applications with noise (DBSCAN) algorithm (default argument: minimum cluster size is three pixels). Two pixels are treated as neighbors within a cluster if the Euclidean distance between them does not exceed 1. The minimum cluster size (in pixels) may be specified by the user. For each cluster, the center is defined, and its coordinates are retrieved and considered as the loop coordinates. The cluster center is defined either as the arithmetic mean of the x and y coordinates of the pixels in the cluster or as the pixel in the cluster with the maximum intensity (default option).
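The clustering step can be illustrated with a simplified stand-in for DBSCAN. On a pixel grid, a Euclidean distance of at most 1 means the four direct neighbors, so the sketch below groups significant pixels by breadth-first search over 4-connected neighbors and then discards small clusters. This matches DBSCAN with a radius of 1 only under the assumption that every significant pixel counts as a core point; the function names and the dictionary layout are hypothetical.

```python
from collections import deque

def cluster_pixels(pixels, min_cluster=3):
    """Group significant pixels that are direct (4-connected) neighbours.

    `pixels` maps (row, col) -> intensity. Returns a list of clusters, each a
    sorted list of (row, col) tuples, keeping clusters with >= min_cluster pixels.
    """
    unvisited = set(pixels)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = deque([seed]), [seed]
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in unvisited:
                    unvisited.remove(nb)
                    queue.append(nb)
                    members.append(nb)
        if len(members) >= min_cluster:
            clusters.append(sorted(members))
    return clusters

def cluster_center(members, pixels, mode="brightest"):
    """Return the loop coordinates of a cluster.

    mode="brightest": the pixel with the maximum intensity (default option);
    any other value: the arithmetic mean of the row and column coordinates.
    """
    if mode == "brightest":
        return max(members, key=lambda rc: pixels[rc])
    rows = sum(r for r, _ in members) / len(members)
    cols = sum(c for _, c in members) / len(members)
    return (rows, cols)
```

For example, three mutually adjacent pixels form one cluster whose default center is the brightest of the three, while an isolated significant pixel is dropped when the minimum cluster size is three.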

In the third step, identified loops may be subjected to various filters, such as the enrichment of signal over background (APA and PA scores; default arguments: ≥ 1.9 and ≥ 1, respectively), signal intensity, signal decay from the center of a cluster, and signal enrichment over random signals at the same distance. All steps and parameters in the LASCA pipeline can be turned on/off and adjusted by the user (default argument: turned off).

APA-score¹⁷ was calculated as the ratio of the intensity of the central pixel of the loop to the average intensity of the right corner of the 11 × 11 pixel-circle around the center of the loop. PA-score¹¹ was calculated as the average ratio between the nearest pixels to the center of the loop and the average intensity of the right corner of the 11 × 11 pixel-circle around the center of the loop.
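A minimal sketch of the two scores, under stated assumptions, is given below. The text does not specify which corner is meant or how large it is, nor how many pixels count as nearest to the center; here the corner is assumed to be the 3 × 3 block at the lower right of a square 11 × 11 window, and the nearest pixels are assumed to be the 3 × 3 block around the center. All function names are hypothetical.

```python
def window(matrix, row, col, half=5):
    """Return the (2*half+1) x (2*half+1) sub-matrix centred on (row, col).

    The caller must keep the window inside the matrix; edges are not handled.
    """
    return [r[col - half: col + half + 1] for r in matrix[row - half: row + half + 1]]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def corner_mean(win, corner=3):
    # Assumption: the "corner" is the corner x corner block at the lower right.
    return mean(v for r in win[-corner:] for v in r[-corner:])

def apa_score(win, corner=3):
    """Central pixel intensity divided by the mean intensity of the corner."""
    c = len(win) // 2
    return win[c][c] / corner_mean(win, corner)

def pa_score(win, corner=3):
    """Mean of the pixels nearest the centre divided by the corner mean.

    Assumption: "nearest pixels" = the 3 x 3 block around the centre,
    including the central pixel itself.
    """
    c = len(win) // 2
    near = [win[i][j] for i in (c - 1, c, c + 1) for j in (c - 1, c, c + 1)]
    return mean(near) / corner_mean(win, corner)
```

With the default filters mentioned in the previous section, a loop would pass when `apa_score` is at least 1.9 and `pa_score` is at least 1.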

Using the LASCA pipeline, we identified loops in several organisms. For human cell line GM12878 Hi-C maps, loops were annotated at 5 and 10 kb resolution, followed by merging of overlapping loops and filtering them by signal enrichment over random signals located at the same distance. The following parameters were used for 5 and 10 kb Hi-C maps: q = 0.95, adjust_by_scale = True, q_value_trhd = 0.2, scaling_q_value_trhd = 0.2, min_cluster = 2, filter_zeros = 2, filter_PA = 1, filter_APA = 1.9, filter_intensity = 0.3. For mouse cell line CH12-LX, loops were identified at 10 kb resolution with the following parameters: q = 0.95, adjust_by_scale = True, q_value_trhd = 0.1, scaling_q_value_trhd = 0.05, min_cluster = 3, filter_zeros = 2, filter_PA = 1, filter_APA = 1.8, filter_intensity = 0.1. For C. elegans, loops were annotated at 5 and 10 kb resolution with subsequent merging of overlapping loops. The following parameters were used: q = 0.95, adjust_by_scale = False, q_value_trhd = 0.05, min_cluster = 2, filter_zeros = 2, filter_PA = 1, filter_APA = 1.7, filter_intensity = 0.1. Finally, for S. cerevisiae, loops were mapped with the following settings: q = 0.95, adjust_by_scale = False, q_value_trhd = 0.01, min_cluster = 3, filter_zeros = 2, filter_intensity = 0.1.

HiCCUPS loops for GM12878 (primary) and CH12-LX cells were obtained from Rao et al.¹⁷ (GSE63525). To identify loops in GM12878 cells using MUSTACHE, we annotated the loops separately on the 5 and 10 kb maps with the following parameters: -pt 0.05, -st 0.88, -sz 1.6, -oc 2, -i 10. We then merged the loop lists found at the different resolutions. Significant contacts in GM12878 cells and S. cerevisiae were also annotated using FitHiC2 with default settings, except for the -U option (2 Mb and 40 kb, respectively). We thresholded significant contacts at q ≤ 10⁻⁹.

We used Bedtools to count overlapping loops. We considered two loops to overlap if the regions between their base points showed a reciprocal intersection of at least 70%.
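The reciprocal-overlap criterion can be made concrete with a short sketch. The analysis itself was run with Bedtools, and the exact invocation is not given in the text; the code below only illustrates the 70% reciprocal rule on intervals spanning the two base points of each loop. The function names and the tuple layout are hypothetical.

```python
def reciprocal_overlap(a, b, min_fraction=0.7):
    """Check whether two loop spans overlap reciprocally.

    a, b: (chrom, start, end) intervals between the two base points of a loop.
    The intersection must cover at least min_fraction of BOTH intervals.
    """
    if a[0] != b[0]:
        return False
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return False
    return (inter >= min_fraction * (a[2] - a[1])
            and inter >= min_fraction * (b[2] - b[1]))

def fraction_recovered(query, reference, min_fraction=0.7):
    """Fraction of query loops that reciprocally overlap any reference loop."""
    hits = sum(
        any(reciprocal_overlap(q, r, min_fraction) for r in reference)
        for q in query
    )
    return hits / len(query)
```

Requiring the overlap in both directions prevents a short loop nested inside a much longer one from being counted as the same loop.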

Metaplot and enrichment analyses

We constructed metaplots for each of the selected organisms using Coolpup.py²⁵ with default settings, except for the --pad parameter, which was 10 for S. cerevisiae and 50 for C. elegans.

To analyze the enrichment of proteins involved in looping, we applied deepTools2²⁷ with --referencePoint TSS (upstream loop anchor); the -a and -b parameters were set to half of the mean loop size for each Hi-C dataset. In the case of enhancer-promoter interaction analysis, we set the -a and -b parameters to 2 Mb.

ChIA-PET and HiChIP loop identification

A table containing the contacts of the CTCF-PET clusters was obtained from Tang et al.²² (GSM1872886). Following the methods of the original paper, we removed from consideration the CTCF-PET clusters that had fewer than four interactions. We then binned the resulting CTCF-PET cluster interaction map using a 10 kb window and considered the resulting interaction coordinates of the 10 kb windows as CTCF-mediated loops. The data for CTCF-HiChIP in .hic format were obtained from Mumbach et al.²⁶ (GSE115524). In accordance with the original paper, we annotated CTCF-mediated interactions using HiCCUPS with the following parameters: -m 500 -r 5000,10000 -f 0.1,0.1 -p 4,2 -i 7,5 -d 20000,20000.
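The ChIA-PET filtering and binning described above might look like the following sketch. The layout of the input records and the use of anchor midpoints for binning are assumptions made for illustration; the original table format and binning rule are not specified in the text, and the function name is hypothetical.

```python
def bin_pet_clusters(clusters, bin_size=10_000, min_pets=4):
    """Filter CTCF-PET clusters and snap their anchors to fixed-size windows.

    clusters: iterable of (chrom, start1, end1, start2, end2, pet_count),
    a hypothetical layout. Clusters with fewer than min_pets interactions are
    dropped. Each anchor is assigned to the window containing its midpoint
    (an assumption). Returns sorted, de-duplicated
    (chrom, window1_start, window2_start) tuples.
    """
    loops = set()
    for chrom, s1, e1, s2, e2, pets in clusters:
        if pets < min_pets:
            continue
        b1 = ((s1 + e1) // 2) // bin_size * bin_size
        b2 = ((s2 + e2) // 2) // bin_size * bin_size
        loops.add((chrom, min(b1, b2), max(b1, b2)))
    return sorted(loops)
```

Binning to the same 10 kb grid as the Hi-C maps lets the ChIA-PET interactions be compared directly with loops called at that resolution.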

CTCF, ATAC-seq, and H3K27Ac peaks

Data for GM12878 cells were obtained from ENCODE (ENCFF410XEP, ENCFF411MHX, ENCFF710VEH). Coordinates of ATAC-seq peaks in hg38 were converted to hg19 using the LiftOver utility (https://genome.ucsc.edu/cgi-bin/hgLiftOver). The base of a loop was considered to contain a peak if it contained at least one peak of the corresponding mark.

Identification of enhancer-promoter interactions

To identify enhancer-promoter interactions, we used the LASCA pipeline on GM12878 Hi-C data at 25 kb resolution without filters and clustering. Next, we selected significant contacts at scales up to 25 kb. We defined promoters as regions from the gene transcription start site (TSS) to 1 kb upstream of that TSS. We intersected these promoters and significant contacts and left only those contacts for which one of the anchors fell inside the promoter region; another anchor, therefore, was assigned as an enhancer.
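The promoter intersection can be sketched as below. The record layouts and function names are hypothetical, promoters are taken strand-aware as the 1 kb upstream of the TSS, and an anchor is assumed to match a promoter when its 25 kb bin overlaps the promoter interval, since a bin is much larger than a promoter.

```python
def promoter_interval(tss, strand, upstream=1_000):
    """Half-open interval from the TSS to `upstream` bp upstream of it."""
    return (tss - upstream, tss) if strand == "+" else (tss, tss + upstream)

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def enhancer_promoter_pairs(contacts, genes, bin_size=25_000):
    """Pair promoters with the opposite anchor of each significant contact.

    contacts: (chrom, bin1_start, bin2_start); genes: (name, chrom, tss, strand).
    Returns (gene, chrom, promoter_anchor, putative_enhancer_anchor) tuples.
    """
    pairs = []
    for chrom, b1, b2 in contacts:
        for name, g_chrom, tss, strand in genes:
            if g_chrom != chrom:
                continue
            p_start, p_end = promoter_interval(tss, strand)
            for anchor, other in ((b1, b2), (b2, b1)):
                if overlaps(anchor, anchor + bin_size, p_start, p_end):
                    pairs.append((name, chrom, anchor, other))
    return pairs
```

The nested loop is quadratic in the number of contacts and genes, which is acceptable for an illustration; an interval index would be the natural choice for genome-wide data.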

Data availability

LASCA, the protocol, and the suite of scripts are publicly available at https://github.com/ArtemLuzhin/LASCA_pipeline. Hi-C datasets for C. elegans (GSE132640) and for the GM12878 (GSE63525) and CH12-LX (GSE63525) cell lines were downloaded from NCBI GEO. The Hi-C dataset for S. cerevisiae (PRJNA427106) was downloaded from NCBI BioProject. ChIP-seq and DNase I sensitivity datasets for GM12878 (ENCFF312KXX, ENCFF158GBQ, ENCFF167NBF, ENCFF180LKW, ENCFF682WPF, ENCFF776OVW) and CH12-LX (ENCFF025UEN) were downloaded from the ENCODE Project. ChIP-seq datasets for S. cerevisiae (GSM1712307) and C. elegans (GSM3680196) were downloaded from NCBI GEO.

References

1. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
2. Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012).
3. Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
4. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
5. Fudenberg, G. et al. Formation of chromosomal domains by loop extrusion. Cell Reports 15, 2038–2049 (2016).
6. Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl. Acad. Sci. USA 112, E6456–6465 (2015).
7. Kyrchanova, O. & Georgiev, P. Mechanisms of enhancer-promoter interactions in higher eukaryotes. Int. J. Mol. Sci. 22, 671 (2021).
8. Bailey, S. D. et al. ZNF143 provides sequence specificity to secure chromatin interactions at gene promoters. Nat. Commun. 2, 6186 (2015).
9. Weintraub, A. S. et al. YY1 is a structural regulator of enhancer-promoter loops. Cell 171, 1573–1588 (2017).
10. Krivega, I., Dale, R. K. & Dean, A. Role of LDB1 in the transition from chromatin looping to transcription activation. Genes Dev. 28, 1278–1290 (2014).
11. Rowley, M. J. et al. Analysis of Hi-C data using SIP effectively identifies loops in organisms from C. elegans to mammals. Genome Res. 30, 447–458 (2020).
12. Anderson, E. C. et al. X chromosome domain architecture regulates Caenorhabditis elegans lifespan but not dosage compensation. Dev. Cell 51, 192–207 (2019).
13. Ohno, M. et al. Sub-nucleosomal genome structure reveals distinct nucleosome folding motifs. Cell 176, 520–534 (2019).
14. Spitz, F. & Furlong, E. E. Transcription factors: from enhancer binding to developmental control. Nat. Rev. Genet. 13, 613–626 (2012).
15. Ay, F., Bailey, T. L. & Noble, W. S. Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts. Genome Res. 24, 999–1011 (2014).
16. Carty, M. et al. An integrated model for detecting significant chromatin interactions from high-resolution Hi-C data. Nat. Commun. 8, 15454 (2017).
17. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
18. Roayaei Ardakany, A., Gezer, H. T., Lonardi, S. & Ay, F. Mustache: multi-scale detection of chromatin loops from Hi-C and micro-C maps using scale-space representation. Genome Biol. 21, 256 (2020).
19. Cao, Y. et al. Accurate loop calling for 3D genomic data with cLoops. Bioinformatics 36, 666–675 (2020).
20. Won, H. et al. Chromosome conformation elucidates regulatory relationships in developing human brain. Nature 538, 523–527 (2016).
21. Sanyal, A., Lajoie, B. R., Jain, G. & Dekker, J. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
22. Tang, Z. et al. CTCF-mediated human 3D genome architecture reveals chromatin topology for transcription. Cell 163, 1611–1627 (2015).
23. Abdennur, N. & Mirny, L. A. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 36, 311–316 (2020).
24. Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).
25. Flyamer, I. M., Illingworth, R. S. & Bickmore, W. A. Coolpup.py: versatile pile-up analysis of Hi-C data. Bioinformatics 36, 2980–2985 (2020).
26. Mumbach, M. R. et al. HiChIRP reveals RNA-associated chromosome conformation. Nat. Methods 16, 489–492 (2019).
27. Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucl. Acids Res. 44, W160–165 (2016).

Acknowledgements

This study was performed using equipment of the Center for Precision Genome Editing and Genetic Technologies for Biomedicine of the Institute of Gene Biology RAS supported by Grant 075-15-2019-1661 from the Ministry of Science and Higher Education of the Russian Federation.

Funding

This study was supported by the Russian Science Foundation (Grant 19-14-00016).

Author information

Authors and Affiliations

Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
Artem V. Luzhin, Arkadiy K. Golov, Alexey A. Gavrilov, Artem K. Velichko, Sergey V. Ulianov, Sergey V. Razin & Omar L. Kantidze
Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
Artem V. Luzhin, Alexey A. Gavrilov & Artem K. Velichko
Institute for Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Moscow, Russia
Artem K. Velichko

Contributions

S.V.R., O.L.K., and S.V.U. conceived the study and coordinated research; A.V.L., A.K.G., A.A.G., A.K.V. wrote source code and performed bioinformatic analyses; A.V.L. and O.L.K. prepared figures; A.V.L. and O.L.K. wrote the manuscript; all authors reviewed the manuscript.

Corresponding authors

Correspondence to Sergey V. Razin or Omar L. Kantidze.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Luzhin, A.V., Golov, A.K., Gavrilov, A.A. et al. LASCA: loop and significant contact annotation pipeline. Sci Rep 11, 6361 (2021). https://doi.org/10.1038/s41598-021-85970-4

Received: 09 November 2020
Accepted: 09 March 2021
Published: 18 March 2021
DOI: https://doi.org/10.1038/s41598-021-85970-4
